Cell 


Profiling SARS-CoV-2 HLA-I peptidome reveals T cell 
epitopes from out-of-frame ORFs 


Authors


Shira Weingarten-Gabbay, 

Susan Klaeger, Siranush Sarkizova, ..., 
Jennifer G. Abelin, Mohsan Saeed, 
Pardis C. Sabeti 


Correspondence 


shirawg@broadinstitute.org (S.W.-G.), 
sklaeger@broadinstitute.org (S.K.), 
msaeed1@bu.edu (M.S.)


In brief 

Analysis of the HLA-I peptidome of SARS-CoV-2 infection identifies peptides
derived from canonical and out-of-frame ORFs in viral S and N protein that are not
captured by current vaccines and yield potent T cell responses in a mouse model
as well as individuals with COVID-19.

Highlights 
• Time course analysis of HLA-I immunopeptidome in SARS-CoV-2-infected cells

• 25% of detected HLA-I peptides originated from out-of-frame ORFs in S and N

• Some out-of-frame peptides elicited stronger T cell responses than canonical peptides

• Early expressed viral proteins dominated HLA-I presentation and immunogenicity


Weingarten-Gabbay et al., 2021, Cell 184, 3962-3980
July 22, 2021 © 2021 The Authors. Published by Elsevier Inc.
https://doi.org/10.1016/j.cell.2021.05.046







Profiling SARS-CoV-2 HLA-I peptidome 
reveals T cell epitopes from out-of-frame ORFs 


Shira Weingarten-Gabbay, Susan Klaeger, Siranush Sarkizova, Leah R. Pearlman, Da-Yuan Chen,
Kathleen M.E. Gallagher, Matthew R. Bauer, Hannah B. Taylor, W. Augustine Dunn, Christina Tarr, John Sidney,
Suzanna Rachimi, Hasahn L. Conway, Katelin Katsis, Yuntong Wang, Del Leistritz-Edwards, Melissa R. Durkin,
Christopher H. Tomkins-Tinch, Yaara Finkel, Aharon Nachshon, Matteo Gentili, Keith D. Rivera, Isabel P. Carulli,
Vipheaviny A. Chea, Abishek Chandrashekar, Cansu Cimen Bozkus, Mary Carrington, MGH COVID-19
Collection & Processing Team, Nina Bhardwaj, Dan H. Barouch, Alessandro Sette, Marcela V. Maus,
Charles M. Rice, Karl R. Clauser, Derin B. Keskin, Daniel C. Pregibon, Nir Hacohen, Steven A. Carr,
Jennifer G. Abelin, Mohsan Saeed, and Pardis C. Sabeti


1Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
2Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
3Department of Biochemistry, Boston University School of Medicine, Boston, MA, USA
4National Emerging Infectious Diseases Laboratories, Boston University, Boston, MA, USA
5Cellular Immunotherapy Program and Cancer Center, Massachusetts General Hospital, Charlestown, MA 02129, USA
6Harvard Medical School, Boston, MA 02115, USA
7Harvard Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA 02115, USA
8Repertoire Immune Medicines, Cambridge, MA 02139, USA
9Center for Infectious Disease and Vaccine Research, La Jolla Institute for Immunology (LJI), La Jolla, CA 92037, USA
10Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
11Translational Immunogenomics Laboratory, Dana-Farber Cancer Institute, Boston, MA, USA

12Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA, USA 

13Department of Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, USA 
14Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, USA 

15Basic Science Program, Frederick National Laboratory for Cancer Research in the Laboratory of Integrative Cancer Immunology, National 
Cancer Institute, Bethesda, MD, USA 

16Massachusetts General Hospital, Harvard Medical School, Boston, MA 02115, USA 

17Massachusetts Consortium on Pathogen Readiness, Boston, MA, USA 

18Department of Medicine, Division of Infectious Diseases and Global Public Health, University of California, San Diego (UCSD), La Jolla, 
CA 92037, USA 

19Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA 




SUMMARY 


T cell-mediated immunity plays an important role in controlling SARS-CoV-2 infection, but the repertoire of 
naturally processed and presented viral epitopes on class I human leukocyte antigen (HLA-I) remains uncharacterized.
Here, we report the first HLA-I immunopeptidome of SARS-CoV-2 in two cell lines at different times
post infection using mass spectrometry. We found HLA-I peptides derived not only from canonical open 
reading frames (ORFs) but also from internal out-of-frame ORFs in spike and nucleocapsid not captured 
by current vaccines. Some peptides from out-of-frame ORFs elicited T cell responses in a humanized mouse 
model and individuals with COVID-19 that exceeded responses to canonical peptides, including some of the 
strongest epitopes reported to date. Whole-proteome analysis of infected cells revealed that early expressed 
viral proteins contribute more to HLA-I presentation and immunogenicity. These biological insights, as well as 
the discovery of out-of-frame ORF epitopes, will facilitate selection of peptides for immune monitoring and 
vaccine development. 


INTRODUCTION

As efforts continue to develop effective vaccines and therapeutic agents against severe acute
respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus causing the ongoing coronavirus
disease 2019 (COVID-19) pandemic (Lu et al., 2020), it is critical to decipher how infected
host cells interact with the immune system. Previous insights from SARS-CoV and Middle East
respiratory syndrome (MERS)-CoV as well as emerging evidence from SARS-CoV-2 imply that
T cell responses play an essential





This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). 




20Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
21Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY 10065, USA
22Health Informatics Lab, Metropolitan College, Boston University, Boston, MA, USA
23Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
24Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
25Department of Immunology and Infectious Disease, Harvard T.H. Chan School of Public Health, Boston, MA, USA
26Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
27These authors contributed equally
28Senior author
29Lead contact




role in SARS-CoV-2 immunity and viral clearance (Altmann and 
Boyton, 2020; Grifoni et al., 2020a; Le Bert et al., 2020; Rydyzn- 
ski Moderbacher et al., 2020; Sekine et al., 2020). Growing con- 
cerns about emerging viral variants and potential resistance to 
antibody defenses have spurred renewed discussions about 
other immune responses and, in particular, cytotoxic T cells 
(Ledford, 2021). When viruses infect cells, their proteins are processed
and presented on the host cell surface by class I human
leukocyte antigen (HLA-I). Circulating cytotoxic T cells recognize
the presented foreign antigens and initiate an immune
response, resulting in clearance of infected cells. Investigating 
the repertoire of SARS-CoV-2-derived HLA-I peptides will 
enable identification of viral epitopes responsible for activation 
of cytotoxic T cells. 

Most studies that have interrogated the interaction between 
T cells and SARS-CoV-2 antigens to date utilized overlapping 
peptide tiling approaches and/or bioinformatics predictions of 
HLA-I binding (Campbell et al., 2020; Ferretti et al., 2020; Grifoni 
et al., 2020b; Nguyen et al., 2020; Poran et al., 2020; Saini et al., 
2020; Tarke et al., 2020). Although HLA-I binding prediction is 
undoubtedly a useful tool to identify putative antigens, it has limitations.
First, antigen processing and presentation is a multistep
biological pathway that includes source protein degradation
by the proteasome, peptide cleavage by aminopeptidases,
translocation into the endoplasmic reticulum (ER), and HLA-I 
binding (Neefjes et al., 2011). Although many computational pre- 
dictors now account for some of these steps, the average posi- 
tive predictive value achieved across HLA alleles is still ~64% 
(Sarkizova et al., 2020). Second, prediction models do not ac- 
count for ways in which viruses may manipulate cellular pro- 
cesses that affect antigen presentation. For example, viruses 
can attenuate translation of host proteins, downregulate proteasome
machinery, and interfere with HLA-I expression (Hansen
and Bouvier, 2009; Sonenberg and Hinnebusch, 2009). These 
changes shape the collection of viral and human-derived HLA-I 
peptides presented to the immune system. Third, prediction 
models do not capture the dynamics of viral protein expression 
during the course of infection. Kinetics studies in vaccinia and 
influenza viruses have shown that HLA-I presentation of viral epi- 
topes can peak 3.5-9.5 h post infection (hpi) (Croft et al., 2013; 
Wu et al., 2019). Moreover, because viruses can suppress 
HLA-I presentation, proteins that are expressed earlier in the vi- 
rus life cycle may contribute more to the repertoire of viral epi- 
topes. In light of these limitations, experimental measurements 
of naturally presented peptides upon infection can deepen our 
understanding of T cell responses to SARS-CoV-2. 


Mass spectrometry (MS)-based HLA-I immunopeptidomics is 
a direct and untargeted method to discover endogenously pre- 
sented peptides (Abelin et al., 2017; Bassani-Sternberg and 
Gfeller, 2016; Chong et al., 2018; Sarkizova et al., 2020). This 
technology has facilitated detection of virus-derived HLA-I peptides
for West Nile virus, vaccinia virus, human immunodeficiency
virus (HIV), human cytomegalovirus (HCMV), and measles
virus (Croft et al., 2013; Erhard et al., 2018; McMurtrey et al., 
2008; Rucevic et al., 2016; Schellens et al., 2015; Ternette 
et al., 2016). These infectious disease studies revealed new an- 
tigens, characterized the kinetics of presented peptides during 
infection, and identified viral peptide sequences that activate 
T cell responses. 

Identifying viral protein sequences from MS data commonly 
relies on matching spectra against a database of known viral 
open reading frames (ORFs) and has largely focused on canon- 
ical ORFs. Over the past decade, genome-wide profiling of 
translated sequences has revealed a striking number of non-ca- 
nonical ORFs in mammalian and viral genomes (Finkel et al., 
2020a; Ingolia et al., 2009, 2011; Stern-Ginossar et al., 2012). 
Although the function of most of these non-canonical ORFs re- 
mains unknown, it is becoming clear that the translated polypep- 
tides serve as fruitful substrates for the antigen presentation ma- 
chinery in viral infection, uninfected cells, and cancer (Chen 
et al., 2020b; Hickman et al., 2018; Ingolia et al., 2014; Maness 
et al., 2010; Ouspenskaia et al., 2020; Ruiz Cuevas et al., 2021; 
Starck and Shastri, 2016; Yang et al., 2016). Importantly, a recent 
study identified 23 unannotated ORFs in the genome of SARS- 
CoV-2, some of which have higher expression levels than the 
canonical viral ORFs (Finkel et al., 2020b). Whether these non- 
canonical ORFs give rise to HLA-I-bound peptides remains 
unknown. 

Here we present the first examination of the HLA-I immunopep- 
tidome in two SARS-CoV-2-infected human cell lines and com- 
plement this analysis with RNA sequencing (RNA-seq) and global 
proteomics measurements. We identify viral HLA-I peptides that 
are derived from canonical and non-canonical ORFs and monitor 
the dynamics of viral protein expression and peptide presenta- 
tion over multiple time points post infection. We show that pep- 
tides derived from out-of-frame ORFs elicit T cell responses in 
immunized mice and individuals with COVID-19 using ELISpot 
and multiplexed barcoded tetramer assays combined with sin- 
gle-cell sequencing. Whole-proteome measurements suggest 
that the time of viral protein expression correlates with HLA-I pre- 
sentation and immunogenicity and that SARS-CoV-2 interferes 
with the cellular proteasomal pathway, potentially resulting in 




lower presentation of viral peptides. Computational predictions 
and biochemical binding assays demonstrate that the detected 
HLA-I peptides can be presented by additional HLA-I alleles 
beyond the nine alleles tested in our study. Our findings can 
inform future immune monitoring assays in affected individuals 
and aid in the design of efficacious vaccines. 


RESULTS 


Profiling HLA-I peptides in SARS-CoV-2-infected cells by MS

To interrogate the repertoire of human and viral HLA-I peptides, 
we immunoprecipitated (IP) HLA-I proteins from SARS-CoV-2- 
infected human lung A549 cells and HEK293T cells that were 
transduced to stably express ACE2 and TMPRSS2, two known 
viral entry factors. We then analyzed their HLA-bound peptides 
by liquid chromatography-tandem MS (LC-MS/MS) (Figure 1A). 
We also analyzed the whole proteome of the IP flowthrough by 
LC-MS/MS and performed RNA-seq to examine the effect of 
SARS-CoV-2 on human gene expression. To allow detection of 
peptides from the complete translatome of SARS-CoV-2, we 
combined the recently identified 23 ORFs (Finkel et al., 2020b) 
with the list of canonical ORFs and the human RefSeq database 
for LC-MS/MS data analysis. 

When choosing cell types for this study, we focused on 
achieving biological relevance and high HLA-I allelic coverage. 
A549 cells are lung carcinoma cells and represent the key biolog- 
ical target of SARS-CoV-2; thus, they are commonly used 
in COVID-19 studies. HEK293T cells endogenously express 
HLA-A*02:01 and B*07:02, two high-frequency HLA-I alleles.
Together, the nine HLA-I alleles expressed by HEK293T and 
A549 cells cover at least one allele in 63.8% of the human pop- 
ulation (Figure 1B; STAR Methods). Using immunofluorescence 
staining of the nucleocapsid protein, we estimated that ~70% of
the transduced cells were infected at the peak infection time 
(Figure S1). 
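A coverage figure of this kind can be approximated from allele frequencies. The sketch below is an illustration only, not the procedure from STAR Methods: it assumes Hardy-Weinberg proportions and independence between the HLA-A, -B, and -C loci, and the frequencies in `example` are hypothetical placeholders.

```python
from collections import defaultdict


def population_coverage(allele_freqs):
    """Fraction of individuals carrying at least one allele in the panel.

    allele_freqs maps names such as "A*02:01" to allele frequencies.
    Assumes Hardy-Weinberg proportions and independent loci (a simplification).
    """
    per_locus = defaultdict(float)
    for allele, freq in allele_freqs.items():
        per_locus[allele.split("*")[0]] += freq  # locus is the text before '*'
    p_none = 1.0
    for total in per_locus.values():
        # Probability that neither chromosome carries a panel allele at this locus.
        p_none *= (1.0 - total) ** 2
    return 1.0 - p_none


# Hypothetical frequencies, for illustration only.
example = {"A*02:01": 0.25, "B*07:02": 0.10}
coverage = population_coverage(example)
```

Summing frequencies within a locus before squaring treats the two chromosomes as independent draws, so adding alleles at a new locus raises coverage faster than adding alleles at a locus that is already well covered.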

We validated the technical performance of our assays by 
examining the overall characteristics of presented HLA peptides. 
We identified 5,837 and 6,372 HLA-bound 8- to 11-mer peptides 
in uninfected and infected (24 hpi) A549 cells and 4,281 and 
1,336 unique peptides in HEK293T cells, respectively (Table 
S1). The reduction in the total number of peptides after infection 
in HEK293T cells is likely due to cell death (~50% of cells 24 hpi). 
As expected, peptide length distribution was not influenced by 
infection, and the majority of HLA-I peptides were 9-mers (Fig- 
ure 1C). Next we compared the binding motifs of all 9-mer pep- 
tides between uninfected and infected cells per cell line and per 
individual HLA allele (Figure 1D; Figures S2A and S2B). We did 
not find major differences following infection, and the observed 
amino acids at the main anchor positions 2 and 9 were in line 
with the expected binding motifs of the alleles expressed in the 
two cell lines. 
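The two summaries used here, the length distribution and the residue usage at anchor positions, are simple to compute from a peptide list. The following is a minimal sketch under our own naming (`length_distribution` and `position_frequencies` are hypothetical helpers); the five sequences serve only as a toy input.

```python
from collections import Counter


def length_distribution(peptides):
    """Fraction of peptides at each length."""
    counts = Counter(len(p) for p in peptides)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


def position_frequencies(peptides, position, length=9):
    """Residue frequencies at a 1-based position among peptides of one length."""
    residues = [p[position - 1] for p in peptides if len(p) == length]
    if not residues:
        return {}
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


# Toy input, for illustration only.
toy_peptides = ["EIKESVQTF", "FAVDAAKAY", "VATSRTLSY", "TVIEVQGY", "KNIDGYFKIY"]
lengths = length_distribution(toy_peptides)
anchor_p2 = position_frequencies(toy_peptides, position=2)
anchor_p9 = position_frequencies(toy_peptides, position=9)
```

Restricting the motif calculation to a single length keeps positions aligned, which is why the comparison in the text is made on 9-mers only.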

To evaluate whether the MS-detected peptides were indeed 
predicted to bind to the expressed HLA-I alleles, we inferred 
the most likely allele to which each peptide binds using HLA- 
thena (Sarkizova et al., 2020). At a stringent cutoff of predicted 
percentile rank of 0.5 or less, 87% of A549 and 73% of
HEK293T cell identified peptides post infection were assigned 




to at least one of the alleles in the corresponding cell line (Fig- 
ure 1E; Figure S2C). Differences in the relative representation 
of HLA alleles on the cell surface are influenced by the expres- 
sion level as well as the permissiveness of the binding motif of 
each allele (Figures S2D and S2E). 
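The assignment step can be pictured as choosing, for each peptide, the allele with the lowest predicted percentile rank and accepting it only if the rank passes the cutoff. This sketch does not call HLAthena or assume its output format; it takes a plain dictionary of ranks, and the values in `toy_ranks` are hypothetical.

```python
def assign_allele(percent_ranks, cutoff=0.5):
    """Return the best-ranked allele, or None if no allele passes the cutoff.

    percent_ranks maps allele name -> predicted percentile rank (lower is better).
    """
    best = min(percent_ranks, key=percent_ranks.get)
    return best if percent_ranks[best] <= cutoff else None


def fraction_assigned(rank_table, cutoff=0.5):
    """Fraction of peptides assigned to at least one allele."""
    assigned = [assign_allele(ranks, cutoff) for ranks in rank_table.values()]
    return sum(a is not None for a in assigned) / len(assigned)


# Hypothetical percentile ranks, for illustration only.
toy_ranks = {
    "PEPTIDE_1": {"A*02:01": 0.1, "B*07:02": 12.0},
    "PEPTIDE_2": {"A*02:01": 3.0, "B*07:02": 0.4},
    "PEPTIDE_3": {"A*02:01": 5.0, "B*07:02": 1.5},
}
share = fraction_assigned(toy_ranks)
```

Peptides that return `None` correspond to the unassigned fraction; loosening the cutoff trades fewer unassigned peptides for more false assignments.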


SARS-CoV-2 HLA-I peptides 

Next we examined HLA-I peptides that are derived from the
SARS-CoV-2 genome (Figure 2A; Table S1). We identified 28 
peptides from canonical proteins (non-structural protein 1 
[nsp1], nsp2, nsp3, nsp5, nsp8, nsp10, nsp14, nsp15, spike 
[S], M, ORF7a, and nucleocapsid [N]). Strikingly, 9 peptides
were derived from out-of-frame ORFs in S and N. Four peptides 
matched to an in-silico six-frame translation database of the 
SARS-CoV-2 genome. However, manual inspection of ribosome 
profiling data (Finkel et al., 2020b) did not support translated 
ORFs in these regions. Most of the HLA-I peptides were detected
in more than one experiment and predicted as good
binders by HLAthena (%rank < 2) to at least one of the expressed 
HLA alleles. We confirmed binding for 19 of the 20 HLA-I pep- 
tides predicted to be presented by four HLA alleles expressed 
in A549 and HEK293T cells (A*02:01, B*07:02, B*18:01, and 
B*44:03) using biochemical binding assays (IC50 < 500 nM; Figure 2B;
Table S2). One peptide, HADQLTPTW, was also detected
in non-infected A549 cells; thus, we removed it from all
subsequent MS analyses. 
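For readers unfamiliar with six-frame translation, the sketch below shows the idea: a nucleotide sequence is translated in three reading frames on each strand using the standard genetic code. It is a minimal illustration, not the database construction used in this study; the function names are our own, and unknown or partial codons are rendered as "X" as an arbitrary choice.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq):
    """Translations of the three forward and three reverse-strand frames."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame("ATGGCCTAA")
```

A peptide that matches only such a database, without independent evidence of translation, is a weaker identification than one matching an ORF supported by ribosome profiling, which is the distinction drawn in the text.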

Surprisingly, we detected only one HLA-I peptide from N, a
SARS-CoV-2 protein expected to be highly abundant based on 
previous RNA-seq and ribosome profiling (Ribo-seq) studies 
(Finkel et al., 2020b; Kim et al., 2020). To test whether this low 
representation could be explained by lower expression of N in 
our experiment, we examined the whole-proteome MS data. 
We found a strong correlation between the abundance of viral 
proteins in the proteome of the two cell lines (Pearson R = 
0.91; Figure 2C) and with recently published translation mea- 
surements in infected Vero cells (Finkel et al., 2020b) (Pearson 
R = 0.86 and R = 0.78 for A549 and HEK293T, respectively; Fig- 
ures 2D and 2E; Table S3A). The N protein remained the most 
abundant viral protein in both cell lines. 

An alternative hypothesis for lower N representation could be 
that the protein harbors fewer peptides compatible with the HLA 
binding motifs. Therefore, for each SARS-CoV-2 ORF, we 
computed the ratio between the number of peptides that are pre- 
dicted to be presented by at least one of the HLA-I alleles in each 
cell line and the number of total 8- to 11-mers. Notably, N had 
fewer predicted presentable peptides than most SARS-CoV-2
proteins in both cell lines (Figures 2F and 2G; Table
S3B). We then expanded our analysis to 92 HLA-I alleles with 
high population coverage and with immunopeptidome-trained 
predictors (Sarkizova et al., 2020; Figure 2H; Table S3B). This 
analysis also categorized N among the least presentable canon- 
ical proteins of SARS-CoV-2. Our results hint that N might be less 
presented than expected, given its high expression level in in- 
fected cells (~10-fold greater than the next most abundant viral 
protein; Figure 2C). 
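The presentation-potential ratio can be written down directly: enumerate every 8- to 11-residue window of a protein and divide the number predicted to bind by the total. In the sketch below the binding predictor is passed in as a function; `toy_rule` is a hypothetical stand-in based on two anchor residues, not a real predictor, and the short sequence is invented.

```python
def kmers(protein, lengths=range(8, 12)):
    """Yield every window of the given lengths (8- to 11-mers by default)."""
    for k in lengths:
        for i in range(len(protein) - k + 1):
            yield protein[i:i + k]


def presentation_potential(protein, is_predicted_binder):
    """Ratio of predicted binders to all 8- to 11-mer windows of a protein."""
    windows = list(kmers(protein))
    if not windows:
        return 0.0
    binders = sum(1 for peptide in windows if is_predicted_binder(peptide))
    return binders / len(windows)


def toy_rule(peptide):
    """Hypothetical stand-in for a predictor: L at position 2, V at the C terminus."""
    return peptide[1] == "L" and peptide[-1] == "V"


n_windows = len(list(kmers("A" * 121)))
ratio = presentation_potential("ALAAAAAAV", toy_rule)
```

A protein of length L contributes (L - 7) + (L - 8) + (L - 9) + (L - 10) windows, so a 121-residue protein gives 450. Normalizing by this total makes the ratio comparable between short and long ORFs.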

Our deep coverage of the viral proteins in the whole-proteome 
analysis (24 proteins) allowed us to observe several interesting 
findings. Although the translation of ORF1a and 1ab, the source 


[Figure 1 image. (A) Workflow schematic: HLA-I immunopeptidome, whole proteome, and RNA-seq of infected cells, followed by protein and peptide identification through a database search against the human proteome, annotated SARS-CoV-2 ORFs, and non-canonical SARS-CoV-2 ORFs. (B) Allele frequency and population coverage (9 alleles across HLA-A, -B, and -C: 63.8%). (C) Peptide length distribution for A549 and HEK293T cells with and without SARS-CoV-2. (D) 9-mer sequence motifs, colored by residue class (acidic, basic, hydrophobic, neutral, polar). (E) Fraction of peptides assigned to each allele.]


Figure 1. HLA-I peptidome and whole-proteome measurements in SARS-CoV-2-infected cells 


(A) Schematic of the experiment and the antigen presentation pathway.
(B) Population frequency of the 9 endogenous HLA-I alleles expressed in A549 and HEK293T cells.
(C) Length distribution of HLA peptides in infected and naive cells.
(D) Motif of 9-mer sequences identified in infected and naive cells.
(E) Fraction of observed peptides assigned to alleles using HLAthena prediction (%rank cutoff < 0.5) for the immunopeptidome of infected and uninfected cells.
See also Figures S1 and S2 and Table S1.


polyproteins of nsp1-nsp16, is 10- to 1,000-fold lower than the
structural ORFs (Finkel et al., 2020b), we found that the abundance
of some nsps was comparable with that of structural proteins
(e.g., nsp1 and nsp8; Figure 2C). Interestingly, although
nsp1-nsp11 were cleaved post-translationally from the same
polyproteins, their expression levels were variable. This finding
is consistent with two additional proteomics studies of SARS-CoV-2-infected
cells utilizing different detergents in their lysis
buffers (Schmidt et al., 2020; Stukalov et al., 2020), suggesting
that the observed differences in expression are not due to detergent
solubility. Moreover, nsp12-nsp15, which originate from
polyprotein 1ab downstream to the frameshift signal, are, as expected,
expressed at lower levels. Another observation is
that the S protein appeared as an outlier in both cell lines with
higher expression in the proteome data compared with Ribo-seq
measurements, suggesting that it may undergo positive

[Figure 2 image. (A) Tracks along the SARS-CoV-2 genome (position in nt): canonical ORFs (RefSeq NC_045512), non-canonical ORFs (Finkel et al.), HLA-IP peptides in HEK293T and A549 cells, proteome peptides in HEK293T and A549 cells at 24 hpi, and HLA predictions for each cell line. (B) Fraction of peptides confirmed to bind their assigned alleles. (C) Viral protein abundance in the two cell lines (iBAQ; Pearson R = 0.91). (D and E) Proteome abundance against Ribo-seq in Vero cells (RPKM): Pearson R = 0.86 (0.99 without S) and Pearson R = 0.78 (0.92 without S). (F and G) ORFs ranked by #peptides predicted to bind / #total peptides for A549 and HEK293T cells. (H) Fraction of peptides with predicted rank within 0.5 per allele.]


post-translational regulation (Figures 2D and 2E; computed 
Pearson R when omitting S increased from 0.86 to 0.99 and 
0.78 to 0.92 in A549 and HEK293T cells, respectively). 
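The effect of a single outlier on a correlation coefficient is easy to reproduce. The sketch below computes Pearson R with and without one named point; the protein labels and values are hypothetical and are not measurements from this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def r_with_and_without(points, omit):
    """Return (R over all points, R with the point named `omit` removed)."""
    kept = {name: xy for name, xy in points.items() if name != omit}
    r_all = pearson_r(*zip(*points.values()))
    r_kept = pearson_r(*zip(*kept.values()))
    return r_all, r_kept


# Hypothetical (x, y) pairs, for illustration only.
toy_points = {"P1": (1.0, 1.0), "P2": (2.0, 2.0), "P3": (3.0, 3.0), "OUT": (2.0, 5.0)}
r_all, r_without = r_with_and_without(toy_points, omit="OUT")
```

With few data points, one protein that departs from the trend can lower R substantially, which is why the text reports the coefficient both ways.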


Kinetics of SARS-CoV-2 protein expression and HLA-I 
peptide presentation 

To investigate the dynamics of HLA-I presentation during infection,
we compared the relative abundance of HLA-I peptides in
A549 and HEK293T cells at 3, 6, 12, 18 and 24 hpi. For technical 
reasons, we split the infection time course analysis into two 
batches (3, 6, and 24 hpi and 12, 18, and 24 hpi) and normalized 
to the 24-hpi time point. 
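Normalizing each batch to the shared 24-hpi sample is what allows the two batches to be placed on one axis. A minimal sketch, assuming one intensity per peptide per time point and expressing each value as a log2 ratio to 24 hpi; the intensities are hypothetical and the function names are our own.

```python
import math


def log2_ratio_to_reference(intensities, reference=24):
    """Express each time point as log2(intensity / intensity at the reference)."""
    ref = intensities[reference]
    return {t: math.log2(value / ref) for t, value in intensities.items()}


def merge_batches(batch_a, batch_b, reference=24):
    """Normalize each batch to its own reference sample, then combine by time."""
    merged = {}
    for batch in (batch_a, batch_b):
        merged.update(log2_ratio_to_reference(batch, reference))
    return dict(sorted(merged.items()))


# Hypothetical intensities for one peptide, keyed by hours post infection.
early_batch = {3: 200.0, 6: 800.0, 24: 400.0}
late_batch = {12: 300.0, 18: 150.0, 24: 150.0}
profile = merge_batches(early_batch, late_batch)
```

Because each batch is divided by its own 24-hpi value, differences in absolute signal between batches cancel, and the 24-hpi point is zero by construction in both.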

Labeling with tandem mass tags (TMT) enabled detection of 10 
viral HLA-I peptides in A549 cells; four of these peptides were 
quantified across all time points, two were only detected in the 
12|18|24-h plex, and four were only detected in the 3|6|24-h plex
(Figure 3A; Table S1). It is likely that peptides that were detected 
only in the 3|6|24-h plex were also presented on HLA-I at 12 and 
18 hpi; however, because of separate cell culture experiments
and data acquisition, they were not detected in the 12|18|24-h 
plex. HLA-I presentation of most detected viral peptides peaked 
at 6 hpi, similar to previous reports for vaccinia virus (Croft et al., 
2013) and influenza virus (Wu et al., 2019). Although some hu- 
man-derived HLA-I peptides changed over time, the majority 
were fairly stable. In HEK293T cells, we detected 13 peptides 
from SARS-CoV-2, with the caveat of observing some peptides 
only in the 3|6|24-h plex as described above (Figure 3B; Table 
S1). Examining the dynamics of HLA-I peptides observed across 
all time points, we found that the abundance of some viral peptides 
peaked at 6 hpi; however, we also observed maximal presentation 
at 12, 18 and 24 hpi for others. 

To assess the relationship between HLA peptide presentation 
and the time of viral protein expression, we performed fraction- 
ated whole-proteome MS analysis across the 3, 6, and 24 hpi 
time points from the same cell lysates. Although the majority of 
viral proteins were expressed in cells at 6 hpi, only eight and 
nine proteins were detected at 3 hpi in A549 and HEK293T cells, 
respectively (Figure 3C). We found that viral proteins detected as 
early as 3 hpi contributed to HLA-I presentation more than viral 
proteins expressed at 6 hpi or later (hypergeometric p <
0.0375; Figure 3D) and elicited stronger CD8+ T cell responses
in COVID-19 convalescent individuals (Tarke et al., 2020) (Wilcoxon
rank-sum p < 0.0181; Figure 3E). This observation may
explain a recent surprising finding that nsp3 is among the four 
most immunogenic proteins of SARS-CoV-2 (Tarke et al., 


Figure 2. SARS-CoV-2 HLA-I immunopeptidome and whole proteome 


€? CelPress 


2020). Although nsp3 is not expressed at high levels, its early 
expression in infected cells may contribute to presentation of 
nsp3-derived HLA-| peptides. 


SARS-CoV-2 infection interferes with cellular pathways 
that may affect antigen processing 

To investigate how the levels of viral source proteins affect their 
ability to be processed and presented, we ranked the individual 
SARS-CoV-2 proteins and HLA-I peptides according to their 
abundance in comparison with human proteins. Although the 
overall abundance of viral proteins in the infected cells proteome 
at 24 hpi was relatively low (HEK293T, 2.696; A549, 396; Fig- 
ure S3AJ, individual viral proteins were highly expressed and ex- 
ceeded most of the host proteins (Wilcoxon rank-sum test; A549, 
p < 10-4; HEK293T, p < 10-9; Figure 4A; Figure S3B; Table S4). In 
contrast to the high expression of their source proteins, the inten- 
sities of viral HLA-I peptides are similar to peptides from the host 
proteome, indicating that viral peptides are not presented prefer- 
entially (Wilcoxon rank-sum test; A549, p > 0.8; HEK293T, p > 0.4; 
Figure 4B, Figure S3C; Table S1). Moreover, as shown recently 
for influenza virus (Wu et al., 2019), we found that the intensities 
of the viral HLA-I peptides do not directly correspond to their 
source protein abundances (Figures 4A and 4B). 

To assess whether there are global changes in HLA-I antigen 
presentation upon infection, we compared the overlap between 
HLA-l peptidomes of uninfected and infected (24 hpi) A549 cells. 
The overlap among peptides detected in both experiments 
(6296; Figure 4C) was similar to what was observed in biological 
replicates of the same sample (Abelin et al., 2017; Demmers 
et al., 2019). This high overlap and the relatively low HLA-I pep- 
tide representation from viral proteins that are expressed at 6 hpi 
or later (Figure 3D) led us to interrogate the whole-proteome data 
for evidence of viral interference with the antigen presentation 
pathway. Because we analyzed the whole proteome from the 
cell lysate post HLA immunopurification, the levels of HLA-A, 
HLA-B, and HLA-C could not be evaluated. However, all other 
host proteins should remain intact and enable proteomic ana- 
lyses of host responses to infection. 

First, we compared the expression of central HLA-I presen- 
tation pathway proteins (e.9., BAM, ERAP1/2, TAP1/2, and pro- 
teasome subunits) between uninfected and infected cells using 
our fractionated proteome data (7,000 quantified proteins; Fig- 
ure 4D; Figure S3D; Table S4). Although some antigen presenta- 
tion proteins had cell-type-specific expression patterns, we 
observed no significant differences in these proteins upon 


(A) Summary of peptide location across the SARS-CoV-2 genome from the HLA-I immunopeptidome, whole proteome, and predictions. 

(B) Biochemical binding of HLA-I peptides to purified major histocompatibility complexes (MHCs). Shown are the fractions of peptides that were confirmed to bind
the assigned alleles (half maximal inhibitory concentration [IC50] < 500 nM; Table S2).

(C) SARS-CoV-2 protein abundance in A549 and HEK293T cells 24 hpi. iBAQ, intensity-based absolute quantification.

(D and E) Comparison of our protein abundance measurements 24 hpi and Ribo-seq (Finkel et al., 2020b) in A549 (D) and HEK293T (E) cells. 

(F) HLA-I presentation potential of SARS-CoV-2 ORFs in A549 cells. ORFs were ranked according to the ratio between the number of peptides predicted to bind
any of the six HLA-I alleles in A549 and the total number of 8- to 11-mers.

(G) Similar to (F) for HEK293T cells.


(H) Presentation potential across 92 HLA-I alleles, shown as boxplots (median ratio, whiskers reach to lowest and highest values no further than 1.5 x interquartile
range [IQR] of the ratio between the number of peptides predicted to bind each allele and total number of peptides). SARS-CoV-2 ORFs are ranked by the median
across HLA-I alleles.
See also Tables S2 and S3. 




Figure 3. HLA-I peptides dynamics in SARS-CoV-2-infected cells
(Figure graphic. Panels A and B plot the ratio to 24 h [log2] against time in h (3, 6, 12, 18, 24) for human, SARS-CoV-2 canonical, and SARS-CoV-2 non-canonical HLA-I peptides in A549 and HEK293T cells; panel D reports hypergeometric p < 0.0375; panel E plots CD8+ T cell responses in COVID-19 patients (% response per protein, Tarke et al.) by earliest expression of SARS-CoV-2 proteins (3 hpi versus >= 6 hpi), p < 0.0181.)



(A and B) Dynamics of TMT-labeled HLA-I peptides 3, 6, 12, 18, and 24 hpi in A549 (A) and HEK293T cells (B). TMT intensity values of peptides detected in two
independent experiments (3, 6, and 24 hpi and 12, 18, and 24 hpi) were normalized to the respective abundance at 24 h present in both experiments. Dashed lines
indicate detection in the 3/6/24-h plex only.


(C) Dynamics of SARS-CoV-2 protein expression according to whole-proteome analysis. 

(D) Venn diagram showing SARS-CoV-2 proteins according to their earliest expression time and the source proteins for HLA-I-presented peptides in A549 and 
HEK293T cells. The hypergeometric p value represents the enrichment of early-expressed proteins (3 hpi) in the group of proteins presented on HLA-I. 

(E) CD8+ responses to early/late-expressed SARS-CoV-2 proteins in convalescent COVID-19 individuals according to a recent study. The box shows the 
quartiles, the bar indicates median, and the whiskers show the distribution (see Table S3 in Tarke et al., 2020). 


infection. Of note, HLA-F, which interacts with KIR3DS1 on nat- 
ural killer (NK) cells during viral infection (Lunemann et al., 2018), 
had increased expression in infected cells. 

Next we compared all proteins detected in uninfected and in- 
fected cells to determine whether proteins involved in ubiquitina- 
tion, proteasomal function, antigen processing, and interferon 
(IFN) signaling were altered (Figure 4E; Table S4). We observed a 
general decrease in ubiquitination pathway proteins, with several 
of them depleted significantly in response to SARS-CoV-2 infec- 
tion, including RNF181, UBE2B, and TRIM11. POMP, a chaperone 
critical for assembly of 20S proteasomes and immunoprotea- 
somes, was the most significantly depleted proteasomal protein 
in infected cell lines (p < 0.0095). POMP has been reported recently 
to affect ORF9c stability, which has been implicated in suppress- 
ing the antiviral response (Dominguez Andres et al., 2020). As re- 
ported across multiple cell lines infected with SARS-CoV-2 
(Chen et al., 20208), the tyrosine kinase JAK1, critical for IFN 
signaling, was depleted in A549 and HEK293T cells upon infection 
(Figure 4E). We confirmed the observed depletion of POMP and 
ubiquitination pathway proteins in an independent proteome study 




(Stukalov et al., 2020) that profiled uninfected and infected A549/
ACE2 cells at 6 hpi (Figure S3E) and 24 hpi (Figure 4F). These
data suggest that SARS-CoV-2 may interfere with IFN signaling
proteins and the HLA-I pathway through POMP depletion and by
altering ubiquitination pathway proteins, which, in turn, may prevent
abundant SARS-CoV-2 proteins expressed later in infection from being
effectively processed and presented.


HLA-I peptides are derived from internal out-of-frame 
ORFs in S and N 

Remarkably, we detected nine HLA-I peptides processed from 
internal out-of-frame ORFs in the coding region of S and N, 
termed S.iORF1 (also known as ORF2b; Jungreis et al., 2021) 
and ORF9b. From S.iORF1/2, we detected three HLA-I peptides 
(GPMVLRGLIT, GLITLSYHL, and MLLGSMLYM) in HEK293T 
cells (Figure 5A). In addition, we detected six HLA-I peptides 
from ORF9b in A549 cells (LEDKAFQL and DEFVVVTV) and 
HEK293T cells (SLEDKAFQL, KAFQLTPIAV, ELPDEFVVV, and 
ELPDEFVVVTV) (Figure 5B). These HLA-I peptides cover over- 
lapping protein sequences and contain binding motifs 


(Figure 4 graphic; legend follows the text below. Panel titles: A549 24 hpi SARS-CoV-2, full proteome protein ranks (Wilcoxon rank-sum test: W = 37422, p = 0.0002544); A549 24 hpi SARS-CoV-2, HLA-I peptide ranks (Wilcoxon rank-sum test: W = 39936, p = 0.8382); HEK293T and A549 24 hpi SARS-CoV-2, n = 6933 proteins, x axis log10 (SARS iBAQ/Uninfected iBAQ); x axis log10 (SARS LFQ/Uninfected LFQ).)



compatible with the expressed HLA-I alleles. To validate the 
amino acid sequences of these non-canonical peptides, we 
compared the tandem mass spectra of synthetic peptides with 
the experimental spectra and observed high correlation between 
fragment ions and retention times (±2 min; Figure 5C).

Six of the peptides from out-of-frame ORFs were predicted to 
bind HLA-A*02:01 in HEK293T cells, suggesting potential for 
widespread presentation of these non-canonical HLA-I peptides
in the population. We confirmed binding for all six peptides using
biochemical measurements in the presence of a high-affinity
radiolabeled A*02:01 ligand (IC50 < 500 nM; Figure 5D; Table S2).
Interestingly, the three peptides with highest affinity among all
tested HLA-I peptides originated from out-of-frame ORFs: two
from S.iORF1/2 (MLLGSMLYM and GLITLSYHL, IC50 < 0.5 nM)
and one from ORF9b (ELPDEFVVVTV, IC50 = 1.6 nM).

In the context of T cell immunity and vaccine development, it is 
crucial to understand the effect of optimizing RNA sequences on 
the endogenously processed and presented HLA-I peptides
derived from internal out-of-frame ORFs. Exogenous expression
of viral proteins in vaccines often involves manipulating the native
nucleotide sequences (e.g., via codon optimization) to enhance 
expression. These techniques maintain the amino acid sequence 
of the canonical ORF but may alter the sequence of proteins en- 
coded in alternative reading frames. In addition to the two current 
mRNA vaccines targeting the S glycoprotein (Callaway, 2020; 
Jackson et al., 2020; Mulligan et al., 2020), the N protein is also 
considered for vaccine development (Dutta et al., 2020; Zhu 
et al., 2004). 

To investigate the effect of codon optimization on HLA-I pep- 
tides derived from S.iORF1/2 and ORF9b, we compared the 
native viral sequence with synthetic S and N from a SARS-CoV- 
2 human optimized ORF library (Gordon et al., 2020). As expected, 
there was no change in the main ORFs; however, the amino acid 
sequences in the +1 frame encoding S.iORF1/2 and ORF9b were 
significantly different (Figures 5E and 5F). In the case of S.iORF1, it
is possible that this ORF is expressed in the human optimized
construct because the methionine driving its translation is preserved;
however, the sequence of potential HLA-I peptides would
be different (Figure 5E). In the case of ORF9b, the start codon was
mutated, a few stop codons were introduced along the ORF, and
the sequence of the detected HLA-I peptides was altered (Figure 5F).
These data suggest that human codon optimization of
the main ORF may preclude the HLA-I presentation of peptides 
encoded from alternative ORFs. 
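
The effect described above, where synonymous recoding preserves the main ORF but rewrites an overlapping +1 frame, can be shown with a short sketch. The two 15-nt sequences are invented toy examples (not the viral or Krogan library sequences), and `translate` is a hypothetical helper built on the standard genetic code.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna, frame=0):
    """Translate complete codons of a DNA string starting at the given offset."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

# Toy sequences: same protein in frame 0, different synonymous codons.
native = "ATGGACCCCAAAATC"
recoded = "ATGGATCCGAAGATT"

print(translate(native, 0), translate(recoded, 0))  # main ORF, identical
print(translate(native, 1), translate(recoded, 1))  # +1 frame, different
```

In this toy case the frame 0 product is unchanged while the +1 frame product differs at most positions, which is the same qualitative outcome reported for S.iORF1/2 and ORF9b.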


Out-of-frame HLA-I peptides elicit T cell responses in 
humanized HLA-A2 mice and individuals with COVID-19 
To evaluate the immunogenicity of the HLA-I peptides detected 
by MS, we conducted three assays probing T cell responses in a 




humanized mouse model, individuals with COVID-19, and unex- 
posed humans. First, we immunized five transgenic HLA-A2 
mice with a pool of 9 A*02:01 peptides for 10 days and tested 
the T cell responses to individual peptides using an IFNγ ELISpot
assay. We found positive responses to three non-canonical peptides
from out-of-frame ORFs, two from S.iORF1/2 (GLITLSYHL
and MLLGSMLYM), and one from ORF9b (ELPDEFVVVTV), as 
well as a canonical peptide from nsp3 (YLNSTNVTI) (Figures 
6A and 6B). 
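
The Figure 6A legend defines a positive ELISpot response as exceeding 3 x the median of the HIV-Gag negative control. A minimal sketch of that call is given below; summarizing each peptide by its median spot count across mice is an assumption, and the function name, peptide labels, and spot counts are all hypothetical.

```python
from statistics import median

def positive_peptides(spot_counts, negative_control, fold=3.0):
    """Flag peptides whose median spot count exceeds fold x the control median."""
    threshold = fold * median(negative_control)
    return {pep: median(counts) > threshold for pep, counts in spot_counts.items()}

# Made-up spot counts for five mice (not data from the study).
hiv_gag = [4, 6, 5, 7, 5]
counts = {
    "peptide_A": [40, 22, 18, 30, 25],
    "peptide_B": [3, 8, 6, 10, 5],
}
calls = positive_peptides(counts, hiv_gag)
print(calls)
```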

Next we investigated the immunogenicity of the HLA-I pep- 
tides in the context of COVID-19. We performed ELISpot assays 
with peripheral blood mononuclear cells (PBMCs) from six 
convalescent individuals expressing HLA-A*02:01 and monitored
IFNγ secretion in response to a pool of 15 HLA-I peptides
from canonical ORFs and 7 peptides from the out-of-frame 
ORFs. As a positive control, we compared the T cell responses 
with a pool of 102 peptides tiling the N protein measured in the 
same individuals as part of another study (Gallagher et al., 
2021). We observed positive responses to the non-canonical 
pool in two of the six samples (Figures 6C and 6D). Notably, in 
one individual, the T cell responses to the non-canonical pool ex- 
ceeded the responses to the N pool, although the number of 
tested peptides was 14-fold lower (7 versus 102 peptides in 
the non-canonical and N pools, respectively). 

To delineate the T cell responses against individual HLA-I pep- 
tides in humans, we utilized a multiplexed technology combining 
a barcoded tetramer assay and single-cell sequencing of 
epitope-reactive CD8+ T cells (Figure 6E; Francis et al., 2021). Using
this method, we obtained information about (1) the ex vivo frequency
of CD8+ T cells reactive to each peptide in each sample; (2)
the sequences of the T cell receptors (TCRs; paired α/β chains)
recognizing each peptide; and (3) gene expression profiles of indi- 
vidual reactive CD8+ T cells. Testing nine HLA-A*02:01 samples 
(seven COVID-19 convalescent and two unexposed), we found 
reactivity to positive control peptides from influenza and SARS-CoV-2
(Figure 6F; Table S5A). As expected, HLA-I peptides that
bind A*02:01 according to our affinity measurements (Table S2) elicited
stronger CD8+ responses than peptides that were detected
on other HLA alleles (Wilcoxon rank-sum p < 10⁻⁸; Figure S4A).
Two non-canonical peptides from ORF9b, ELPDEFVVVTV and
SLEDKAFQL, were in the top five reactive peptides (Table S5A).
Strikingly, ELPDEFVVVTV invoked the strongest CD8+ response 
among all tested HLA-I peptides, with the frequency of detected 
T cells similar to that observed for the influenza epitope and above 
those for three commonly recognized SARS-CoV-2 epitopes: 
YLQPRTFLL, KLWAQCVQL, and LLYDANYFL (Ferretti et al., 
2020). Of note, YLQPRTFLL has been considered the most reac- 
tive SARS-CoV-2 epitope in a few independent studies (Ferretti 
et al., 2020; Habel et al., 2020; Shomuradova et al., 2020). 


Figure 4. The effect of SARS-CoV-2 infection on antigen presentation in host cells 

(A) Rank plot of protein abundances (log10 protein iBAQ) from human and SARS-CoV-2 detected in the whole-proteome analysis.

(B) Similar rank plot as in (A) but for observed HLA-I peptide abundance (log2 intensity).

(C) Venn diagram showing the overlap between total HLA-I peptides in uninfected and infected A549 cells. 

(D) Expression heatmap of central antigen presentation pathway proteins in uninfected and infected cells (24 hpi). 

(E) Volcano plot comparing protein levels in uninfected and infected A549 and HEK293T cells 24 hpi (dashed line, p < 0.01, moderated t test).
(F) Similar to (E); a volcano plot representing whole-proteome data from A549/ACE2 cells 24 hpi (Stukalov et al., 2020). 


See also Figure S3 and Table S4. 




(Figure 5 graphic; legend follows the text below. Panels: (A) S (Spike) with S.iORF1 and S.iORF2 and (B) N (Nucleoprotein) with ORF9b (N.iORF1) and N.iORF2, by position on the SARS-CoV-2 genome (nt); (C) mirror spectra (m/z) for GLITLSYHL and GPMVLRGLIT (S.iORF1/2; HEK293T + SARS-CoV-2, 24 h), LEDKAFQL (ORF9b; A549 + SARS-CoV-2, 24 h), and SLEDKAFQL (ORF9b; HEK293T + SARS-CoV-2, 24 h) against synthetic peptides; (D) peptide affinity [log10 nM]; (E) canonical Spike in the region of S.iORF1 and S.iORF1/2 (+1 reading frame), SARS-CoV-2 versus human optimized (Krogan library); (F) canonical Nucleocapsid in the region of ORF9b and ORF9b (+1 reading frame), SARS-CoV-2 versus human optimized (Krogan library).)



Examining the gene expression profile and the TCR sequence 
of the reacting T cells provided additional supporting evidence 
for the functional relevance of the ELPDEFVVVTV epitope during 
the course of COVID-19. Most cells reactive to ELPDEFVVVTV 
showed high expression of effector markers and moderate to 
high expression of memory markers based on gene sets 
described in a recent COVID-19 CD8+ subpopulation profiling 
study (Figure 6G; Su et al., 2020). In addition, the TCR sequences 
of CD8+ T cells reactive to ELPDEFVVVTV revealed significant 
CDR3 homology across affected individuals (Figures S4B-S4D). 

Although our T cell data provide evidence of CD8+ responses 
to peptides from ORF9b in individuals with COVID-19, we did not 
detect significant responses to HLA-I peptides from S.iORF1/2, 
GLITLSYHL and MLLGSMLYM, in the seven tested COVID-19 
samples. To evaluate the immunogenicity of the third HLA-I peptide
from S.iORF1/2, GPMVLRGLIT, we performed an additional
barcoded tetramer assay with PBMCs from individuals with 
COVID-19 expressing HLA-B*07:02. We observed the expected 
positive reactivity to control peptides from EBV (RPPIFIRRL) and 
SARS-CoV-2 (SPRWYFYYL) as well as overall greater CD8+ responses
to HLA-I peptides that bind B*07:02 (Wilcoxon rank-sum
p < 10⁻¹⁰; Figures S4E and S4F; Table S5B). However,
we found no significant responses to GPMVLRGLIT in affected 
individuals, although we detected this peptide multiple times in 
our MS experiments (Table S1). It is possible that our assay 
was not sensitive enough to capture T cell responses to the three 
non-canonical peptides from S.iORF1/2 because we also 
observed weak responses to KLWAQCVQL, a commonly recog- 
nized A*02:01 epitope in individuals with COVID-19 (Ferretti 
et al., 2020; Takagi and Matsui, 2020), exhibiting similar reactivity 
as GLITLSYHL from S.iORF1/2. 


SARS-CoV-2 HLA-I peptides can be presented by 
additional alleles in the population 

Increasingly accurate HLA-I presentation prediction tools are 
applied routinely to the full transcriptome or proteome of an or- 
ganism to computationally nominate presentable epitopes. 
However, these tools are trained on data that are agnostic to vi- 
rus-specific processes that may interfere with the presentation 
pathway. Thus, the sensitivity and specificity of in silico predic- 
tions for any particular virus are characterized insufficiently. To 
assess how well computational tools would recover the MS- 
identified HLA-I peptides, we used HLAthena (Abelin et al., 
2017; Sarkizova et al., 2020) to retrospectively predict all 8- to 
11-mer peptides tiling SARS-CoV-2 proteins against the com- 
plement of HLA-I alleles expressed by A549 and HEK293T cells 


Figure 5. SARS-CoV-2 HLA-I peptides from S.iORF1/2 and ORF9b 




(Figure 7A; Table S6A). Of the 36 MS-identified peptides, 23 had
a predicted percentile rank (%rank) of less than 0.5, and 31 had a
%rank of less than 2.

Within 39,875 possible SARS-CoV-2 8- to 11-mers, 14 of 18 
A549 HLA-I peptides and 11 of 18 HEK293T peptides had %rank
scores within the top 1,000 viral peptides (top 1.5% and
1.7% for A549 and HEK293T cells, respectively). To account 
for variability in viral protein expression levels, we repeated this 
analysis within the source protein of each peptide. We found 
that 16 of 36 peptides scored within the top 10 among all 8- to 
11-mers of the source protein, and 21 scored within the top 
20. These observations suggest that, although an in silico 
epitope prediction scheme that nominates the top 10-20 pep- 
tides of each viral protein would recover ~50% (16-21 of 36) of 
observed epitopes with very high priority, this list would still 
only encompass ~5%-10% true LC-MS/MS positives (16-21 
of 10 x #proteins). 
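
The tiling and top-N recovery analysis can be sketched as follows. This is an illustration under stated assumptions, not the HLAthena workflow: `tile_kmers` and `recovered_in_top_n` are hypothetical helpers, the 12-residue toy string is only a stand-in for a source protein, and the %rank values are placeholders assigned by enumeration order rather than real predictions.

```python
def tile_kmers(protein, k_min=8, k_max=11):
    """Return every k-mer (k_min..k_max) tiling a protein sequence."""
    return [protein[i:i + k]
            for k in range(k_min, k_max + 1)
            for i in range(len(protein) - k + 1)]

def recovered_in_top_n(percent_rank, observed, n):
    """Observed peptides among the n best-scoring candidates (lower %rank is better)."""
    top = set(sorted(percent_rank, key=percent_rank.get)[:n])
    return [p for p in observed if p in top]

toy_protein = "ELPDEFVVVTVK"
candidates = tile_kmers(toy_protein)

# Placeholder scores: a real analysis would use predicted %rank per allele.
toy_rank = {pep: float(i) for i, pep in enumerate(candidates)}

hits = recovered_in_top_n(toy_rank, ["LPDEFVVV", "DEFVVVTV"], n=3)
print(len(candidates), hits)
```

Ranking within each source protein, as done in the text, corresponds to calling `recovered_in_top_n` once per protein with n set to 10 or 20.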

Next we estimated the HLA allele coverage achieved by the 
observed endogenously processed and presented viral epitopes 
among African Americans (AFA), Asian Pacific Islanders (API), 
European (EUR), Hispanic (HIS), United States, and world popu- 
lations at different %rank cutoffs based on HLAthena predictions 
across 92 HLA-I alleles (Figure 7B; Figure S5; Tables S6B and 
S6C). At the second most stringent cutoff, %rank of 0.5 or
less, 31 of the 36 individual peptides were predicted to bind at
least one allele (range, 1-21; median, 4.9; mean, 4.5). Combined,
the MS-identified peptide pool was estimated to cover at least
one HLA-A, HLA-B, or HLA-C allele for 99% of the population
with at least one peptide.
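
One common way to estimate such population coverage is sketched below. The formula is an assumption, not necessarily the calculation used in this study: it treats the two allele copies at each locus as independent draws (Hardy-Weinberg proportions) and the three loci as independent of each other. The function name and the allele frequencies are hypothetical.

```python
def population_coverage(covered_freqs_by_locus):
    """Fraction of individuals carrying at least one covered allele at any locus.

    covered_freqs_by_locus maps a locus name to the population frequencies of
    the alleles at that locus predicted to bind at least one peptide.
    """
    uncovered = 1.0
    for freqs in covered_freqs_by_locus.values():
        f = min(sum(freqs), 1.0)
        # Probability that both allele copies at this locus are uncovered.
        uncovered *= (1.0 - f) ** 2
    return 1.0 - uncovered

# Toy frequencies (not from Table S6).
toy = {"HLA-A": [0.3, 0.2], "HLA-B": [0.3], "HLA-C": [0.2]}
coverage = population_coverage(toy)
print(round(coverage, 4))
```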

To validate the predicted binding of the HLA-I peptides, we 
performed biochemical binding measurement with 30 synthetic 
peptides and 5 HLA alleles not present in the two profiled cell 
lines. We confirmed binding for 5 of 9 (56%) HLA-I peptides predicted
at a 0.5 %rank threshold and 12 of 29 (41%) peptides predicted
at a %rank threshold of 2 (Figure 7C), with significantly
higher measured affinities for predicted binders versus non- 
binders (Figure 7D; Table S2). Moreover, two peptides with 
predicted presentation on HLA alleles not profiled in our cell 
lines have been found recently to elicit T cell responses in conva- 
lescent COVID-19 individuals expressing the predicted alleles 
(EILDITPCSF and QLTPTWRVY, detected on A*25:01 and
C*16:01, were predicted to bind A*26:01 and A*30:02 at a %rank
of 0.5 or less, respectively; Table S7; Tarke et al., 2020).
These results indicate that HLA-I immunopeptidomics on only 
two cell lines, combined with epitope prediction tools, can help 
prioritize CD8+ T cell epitopes with high population coverage. 


(A) HLA-I peptides derived from S.iORF1/2. Underscored methionines (M) represent the start codons of S.iORF1 and S.iORF2. 


(B) HLA-I peptides derived from ORF9b (N.iORF1) and N.iORF2. 


(C) Mirror plots with fragment ion mass spectra confirming the sequences of four HLA-I peptides that were identified in S.iORF1/2 and ORF9b (positive y axis, HLA
IP samples; negative y axis, synthetic peptide).

(D) Biochemical HLA-A*02:01/peptide binding measurements. The concentration of peptide yielding 50% inhibition of the binding of the radiolabeled A*02:01
ligand (IC50) was used to calculate peptide affinity.

(E) The effect of human codon optimization on HLA-I peptides derived from S.iORF1/2. Shown is Needleman-Wunsch pairwise global alignment between the
SARS-CoV-2 sequence (NC_045512.2) and the human optimized S from the Krogan library (Gordon et al., 2020) in the S.iORF1/2 coding region. Purple boxes
indicate the position of the HLA-I peptides in the out-of-frame ORFs.

(F) Similar to (E) but for N in the ORF9b coding region.




(Figure 6 graphic; legend follows the text below. Panels: (A) mouse immunization scheme and ELISpot responses to canonical and non-canonical peptides; (B) ELISpot wells for (-) control, GLITLSYHL, ELPDEFVVVTV (ORF9b), MLLGSMLYM (S.iORF1), and YLNSTNVTI (nsp3); (C and D) ELISpot responses to the canonical pool (15 peptides), non-canonical pool (7 peptides), and N pool (102 peptides, Gallagher et al.); (E) multiplexed tetramer assay workflow; (F) reactivity heatmap across samples 1-9 (COVID-19 convalescent and unexposed); (G) UMAP of tetramer-positive cells colored by Leiden cluster.)

HLA-I alleles of the six individuals shown in (C):

Patient   HLA-A            HLA-B            HLA-C
1         A*02:01/02:01    B*08:01/08:01    C*07:01/07:01
2         A*02:01/32:02    B*13:02/14:02    C*06:02/08:02
3         A*02:01/24:02    B*15:01/44:02    C*03:02/05:01
4         A*02:01/02:05    B*40:01/58:01    C*03:09/07:01
5         A*01:01/02:01    B*41:01/44:02    C*05:01/17:01
6         A*02:01/68:01    B*08:01/52:01    C*07:02/12:02



DISCUSSION 


We provide the first view of SARS-CoV-2 HLA-I peptides that are 
endogenously processed and presented by infected cells. 
Although our study profiled two cell lines, it uncovers insights 
into SARS-CoV-2 antigen presentation that extend beyond the 
nine HLA alleles tested here. (1) A substantial fraction, 9 of 36 
(25%), of viral peptides detected are derived from internal out-of-frame
ORFs in S (S.iORF1/2) and N (ORF9b). Remarkably,
HLA-I peptides from non-canonical ORFs were strongly immu- 
nogenic in immunized mice and convalescent COVID-19 individ- 
uals, as shown by pooled ELISpot and multiplexed tetramer 
assays. These observations imply that current interrogations of 
T cell responses in individuals with COVID-19, which focus on 
the canonical viral ORFs (Grifoni et al., 2020a; Weiskopf et al., 
2020), exclude an important source of virus-derived HLA-I epi- 
topes. (2) A large fraction of detected HLA-I peptides were 
from nsps. Although earlier studies focused mostly on T cell re- 
sponses to structural proteins, this finding, together with recent 
studies that expanded their epitope pools to include nsps (Dan 
et al., 2020; Kared et al., 2021; Tarke et al., 2020), portrays nsps
as an integral part of the T cell response to SARS-CoV-2. (3) 
The timing of SARS-CoV-2 protein expression appears to be a 
key determinant for antigen presentation and immunogenicity. 
Proteins expressed earlier in infection (3 hpi) were more likely 
to be presented on the HLA-I complex and elicit a T cell response 
in individuals with COVID-19. 
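
The enrichment of early-expressed proteins among HLA-I source proteins (Figure 3D) was assessed with a hypergeometric test. A minimal sketch of such an upper-tail test is shown below; the function name and the counts in the example are illustrative assumptions and are not the counts behind the reported p value.

```python
from math import comb

def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement.

    population: all viral proteins considered
    successes:  proteins first detected early (for example, 3 hpi)
    draws:      proteins that are sources of HLA-I peptides
    observed:   early-expressed proteins among those sources
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Toy counts only.
p = hypergeom_enrichment_p(population=10, successes=4, draws=5, observed=4)
print(p)
```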

Recent findings highlight the need to look beyond antibodies 
for strategies to achieve long-lasting protection against COVID- 
19 (Ledford, 2021). Several newly emerged SARS-CoV-2 vari- 
ants are poorly neutralized by antibodies raised against the 
parental isolates used in the current vaccines (Chen et al., 
2021; Wu et al., 2021). Importantly, recent studies have shown 
that CD8+ T cell responses are not substantially affected by mutations
found in prominent SARS-CoV-2 variants (Redd et al.,
2021; Tarke et al., 2021). Thus, integrating T cell epitopes into 
the design of next-generation vaccines has the potential to pro- 
vide prolonged protection in the face of emerging variants. Our 
work reveals that ORF9b is an important source of T cell epitopes
that remains largely unexplored in the context of T cell immunity.
Although relatively short (97 amino acids [aa]), ORF9b


Figure 6. T cell responses to SARS-CoV-2 HLA-I peptides 




yielded six HLA-I peptides (16% of total detected peptides) in 
A549 and HEK293T cells that bind at least four different alleles 
(A*02:01, B*18:01, B*44:03, and A*26:01). We identified two
A*02:01 peptides, ELPDEFVVVTV and SLEDKAFQL, that elicit
CD8+ T cell responses in convalescent individuals, demonstrating
that ORF9b is translated and presented on HLA-I in vivo
during the course of COVID-19. Moreover, ORF9b is highly expressed
and among the few viral proteins that are detected
early in infection, two traits that correlate with HLA-I presentation
and immunogenicity. Specifically, our study highlights
ELPDEFVVVTV as a promising T cell epitope. It binds A*02:01
and A*26:01, elicits strong T cell responses in immunized
mice and individuals with COVID-19, and is recognized by
TCRs from different affected individuals sharing a mutual
CDR3 motif. Importantly, ELPDEFVVVTV elicits stronger T cell
responses (in five of seven individuals studied here) than the 
three most commonly recognized A*02:01 SARS-CoV-2 epi- 
topes (Ferretti et al., 2020), including YLQPRTFLL, which was 
recorded as the most potent SARS-CoV-2 epitope in three 
independent studies (Ferretti et al., 2020; Habel et al., 2020; 
Shomuradova et al., 2020) and is the target of commercial 
monomer and tetramer assays. 

In contrast to ORF9b, S.iORF1/2-derived peptides did not 
elicit significant T cell responses in convalescent COVID-19 indi- 
viduals. This finding is surprising, given that GLITLSYHL and 
MLLGSMLYM had the highest affinity to HLA-A*02:01 among 
all HLA-I peptides tested and were immunogenic in a humanized 
mouse model, demonstrating that they can elicit T cell responses 
in vivo. Moreover, GLITLSYHL immunogenicity in mice was 10- 
fold higher than ELPDEFVVVTV, the most potent SARS-CoV-2 
epitope detected in individuals with COVID-19, with responses 
comparable only with an influenza epitope. The discrepancy between
the immunogenicity of S.iORF1/2-derived peptides in
mice and individuals with COVID-19 could suggest an immune 
evasion mechanism to attenuate the translation and/or antigen 
processing of these non-canonical ORFs in affected individuals. 
Testing T cell responses in convalescent samples, as done in our 
study, is biased toward symptomatic individuals, and perhaps 
T cell reactivity to these peptides is associated with asymptom- 
atic infection. Interestingly, although the sequence encoding 
the canonical and ORF9b-derived HLA-I peptides remained 


(A) Five HLA-A2 transgenic mice were immunized with a pool of nine HLA-I peptides detected on A*02:01 in HEK293T cells for 10 days. Splenocytes were 
incubated with individual peptides and monitored for IFNγ secretion. HLA-A*02:01 restricted HIV-Gag peptide and non-stimulated wells were used as negative
controls. Anti-CD3 and phytohemagglutinin (PHA) were used as positive controls. The dashed line represents the threshold for positive responses (3 x the median 
of the HIV-Gag). The box shows the quartiles, the bar indicates median, and the whiskers show the distribution. 

(B) ELISpot images from one of the five vaccinated mice. Numbers indicate the spot count. 

(C) PBMCs from convalescent COVID-19 individuals expressing A*02:01 alleles were incubated with a pool of HLA-I peptides from canonical or out-of-frame 
ORFs. A pool of 102 peptides tiling the entire nucleocapsid (N) protein that was evaluated in the same samples (Gallagher et al., 2021) served as positive control. 
Bars show the mean of duplicates. 

(D) ELISpot images of individuals #1 and #3. Numbers indicate the spot count. 

(E) Illustration of the multiplexed tetramer assay and T cell single-cell profiling. 

(F) CD8+ T cell reactivity detected in convalescent COVID-19 individuals and unexposed subjects expressing A*02:01 to individual HLA-I peptides. The score in
the heatmap indicates the fraction of peptide-specific reacting T cells from total CD8+ cells in the sample.

(G) Single-cell transcriptomics of reactive T cells. Top panel: uniform manifold approximation and projection (UMAP) embedding of all tetramer-positive cells 
colored by unsupervised clustering. Center panel: expression levels of 15 genes associated with different states of T cells, as characterized previously (Su et al., 
2020). Bottom panel: expression level of these 15 genes in individual T cells reactive to ELPDEFVVVTV peptide from ORF9b. 

See also Figure S4 and Table S5. 
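
The positivity threshold described in (A), three times the median of the HIV-Gag negative control, can be illustrated with a minimal Python sketch. The spot counts, peptide labels, and function name below are hypothetical, and calling a peptide positive when its median count exceeds the threshold is an assumption made for illustration; the legend does not specify how responses were aggregated across mice.

```python
from statistics import median

def positive_peptides(spot_counts, negative_control, fold=3.0):
    """Return (threshold, peptides whose median spot count exceeds the threshold)."""
    # Threshold = fold x median of the negative-control wells (3x in the legend).
    threshold = fold * median(spot_counts[negative_control])
    hits = [
        peptide
        for peptide, counts in spot_counts.items()
        if peptide != negative_control and median(counts) > threshold
    ]
    return threshold, hits

# Hypothetical spot counts for five mice (illustrative only, not study data).
example = {
    "HIV-Gag": [4, 6, 5, 7, 5],
    "peptide_A": [120, 90, 150, 110, 95],
    "peptide_B": [8, 10, 6, 12, 9],
}
threshold, hits = positive_peptides(example, "HIV-Gag")
print(threshold, hits)
```

Using the median of the control rather than the mean keeps a single noisy control well from shifting the cutoff.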


[Figure 7, panels A-D: graphic and embedded table not recoverable from the text extraction. Panel A: table of LC-MS/MS-identified peptides listing viral protein, best predicted %rank and allele, rank and %rank within all peptides and within the viral protein, and alleles with %rank <= 2. Panel B: "Predicted presentation across 92 HLA class I alleles" (Population: World; peptides sorted by coverage), showing the number of predicted HLA class I alleles with %rank <= 0.5 and the estimated population coverage (World), with alleles sorted by loci and population coverage. Panel D: predicted binders (rank <= 2) versus predicted non-binders (rank > 2), p < 1e-6.]


Figure 7. Presentation prediction and population coverage estimates of MS-identified SARS-CoV-2 HLA-I peptides 
(A) Summary of LC-MS/MS-identified SARS-CoV-2 epitopes with corresponding HLAthena predictions for the 6 HLA-I alleles expressed by A549 cells and the
3 HLA-I alleles in HEK293T cells.

(B) HLAthena predictions for 92 HLA-I alleles. Left: the number of unique HLA-I alleles predicted as strong binders. Right: estimated population coverage. Alleles
are colored and ordered according to loci and world population frequency (high to low color intensity).

(C) Biochemical binding measurements of HLA-I peptides and five HLA alleles that were not profiled in our cell lines. Shown are the fractions of peptides that were
confirmed to bind the predicted alleles (IC50 < 500 nM; Table S2).

(D) IC50 (nM) affinity measurements of HLA-I peptides for nine alleles separated by predicted binders (%rank < 2) and predicted non-binders (%rank > 2) (Welch
two-sample t test; data are presented as median; whiskers reach to the lowest and highest values no further than 1.5× IQR).


See also Figure S5 and Table S6. 
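
The comparison summarized in (C) and (D) can be sketched in a few lines of Python: peptide-allele pairs are split by predicted %rank, and predicted binders count as confirmed when the measured IC50 is below 500 nM. The example values and the function name are hypothetical, and treating a %rank of exactly 2 as a predicted binder is an assumption; the Welch test itself is omitted here.

```python
from statistics import median

RANK_CUTOFF = 2.0       # %rank cutoff separating predicted binders from non-binders
IC50_CUTOFF_NM = 500.0  # measured IC50 below this value counts as confirmed binding

def summarize_binding(measurements):
    """Summarize (percent_rank, ic50_nm) pairs, one per peptide-allele combination."""
    # Assumption: a %rank of exactly 2 is treated as a predicted binder.
    binders = [ic50 for rank, ic50 in measurements if rank <= RANK_CUTOFF]
    non_binders = [ic50 for rank, ic50 in measurements if rank > RANK_CUTOFF]
    confirmed = sum(1 for ic50 in binders if ic50 < IC50_CUTOFF_NM)
    return {
        "n_predicted_binders": len(binders),
        "n_confirmed": confirmed,
        "median_ic50_binders": median(binders) if binders else None,
        "median_ic50_non_binders": median(non_binders) if non_binders else None,
    }

# Hypothetical measurements (illustrative only, not values from the study).
example = [(0.1, 20.0), (0.5, 150.0), (1.5, 800.0), (5.0, 12000.0), (20.0, 40000.0)]
summary = summarize_binding(example)
print(summary)
```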


unchanged in the recent emerging SARS-CoV-2 variants 
B.1.1.7, P.1, and B.1.351 (originally detected in the United 
Kingdom, Brazil, and South Africa, respectively; Rambaut 
et al., 2020), the three HLA-I peptides derived from S.iORF1/2 
were mutated (Figure S6; Table S8): S 69-70 deletion in the 
B.1.1.7 variant results in deletion of the last two amino acids of 
the S.iORF1-derived MLLGSMLYM epitope, whereas D80A substitution
in canonical S of the B.1.351 variant results in I-to-L substitution
in GPMVLRGLIT and GLITLSYHL. This could be due to
relaxed selective pressure, allowing mutations to emerge, or 
may point to potential positive selective pressure on these 
T cell epitopes encoded from the alternative out-of-frame 
ORFs. Further studies, including testing T cell responses in
asymptomatic subjects, are needed to elucidate the potential
role of S.iORF1/2-derived peptides in COVID-19. 
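
To illustrate how one nucleotide change can alter two overlapping reading frames at once, the sketch below translates a short sequence in frame 0 and frame +1 before and after a single A-to-C change. The 12-nucleotide sequence is a hypothetical toy example, not the SARS-CoV-2 S sequence; it was chosen so that the change produces D-to-A in frame 0 and I-to-L in frame +1, mirroring the pattern described for D80A.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(seq, frame=0):
    """Translate complete codons of seq starting at the given frame offset."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

# Hypothetical toy sequences (not viral sequence); they differ at one position.
wild_type = "ATGGATTACGCA"
mutant = "ATGGCTTACGCA"  # A-to-C at index 4

for name, seq in (("wild type", wild_type), ("mutant", mutant)):
    print(name, "frame 0:", translate(seq, 0), "frame +1:", translate(seq, 1))
```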

Our analyses demonstrate that synthetic approaches aiming 
to enhance the expression of canonical ORFs, some of which 
are utilized in current vaccine strategies, can inadvertently elim- 
inate or alter production of HLA-I peptides derived from overlap- 
ping reading frames. Researchers may need to carefully examine 
the effect of sequence manipulation and codon optimization on 
internal overlapping ORFs, especially those encoding HLA-I 
peptides. In broader terms, many viral genomes have evolved 
to increase their coding capacity by utilizing overlapping ORFs 
and programmed frameshifting (Ketteler, 2012). Thus, our findings
suggest a more general principle in vaccine design
according to which optimizing expression of desired antigens
using codon optimization can be at the expense of CD8+ 
response when the same region encodes a source protein for 
T cell epitopes in an alternative frame. Combining insights from 
ribosome profiling and HLA-I immunopeptidomics can uncover 
the presence of non-canonical peptides that will enable more 
informed decisions in vaccine design. 
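
One way to act on this principle is to check a recoded sequence for lost out-of-frame epitopes before it is used. The following is a minimal sketch under stated assumptions: the sequences, the epitope string, and the helper `lost_epitopes` are hypothetical, and the check only compares literal peptide strings in one alternative frame. It does not model translation initiation, antigen processing, or HLA binding.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(seq, frame):
    """Translate complete codons of seq starting at the given frame offset."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))

def lost_epitopes(original, recoded, frame, epitopes):
    """Return epitopes encoded in `frame` of `original` but absent after recoding."""
    before = translate(original, frame)
    after = translate(recoded, frame)
    return [e for e in epitopes if e in before and e not in after]

# Hypothetical toy example: the recoding is synonymous in frame 0 only.
original = "ATGGATTACGCA"
recoded = "ATGGACTACGCC"
same_protein = translate(original, 0) == translate(recoded, 0)
lost = lost_epitopes(original, recoded, 1, ["WIT"])
print(same_protein, lost)
```

In this toy case the canonical product is unchanged while the frame +1 peptide is lost, which is the situation the paragraph above warns about.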

Proteomics analyses of infected cells show that SARS-CoV-2 
may interfere with presentation of HLA-I peptides and expres- 
sion of ubiquitination and immune signaling pathway proteins. 
We found that SARS-CoV-2 infection leads to a significant 
decrease in expression of POMP and ubiquitination pathway 
proteins. By affecting ubiquitin-mediated proteasomal degrada- 
tion and immune signaling proteins, SARS-CoV-2 may reduce 
the precursors for downstream processing and HLA-I presenta- 
tion and alter the immune response. The effects of SARS-CoV-2 
on HLA-I presentation may be influenced by additional factors,
such as translation inhibition by nsp1 (Schubert et al., 2020) 
and degradation of host transcripts (Finkel et al., 2020c), that 
can diminish antigen presentation by attenuating the expression 
of HLA-I molecules. Moreover, a recent study reports that ORF8 
protein disrupts HLA-I antigen presentation and reduces recognition
and the elimination of virus-infected cells by cytotoxic T
lymphocytes (CTLs) (Park, 2020; Zhang et al., 2020). Further 
research is needed to directly probe the various effects of 
SARS-CoV-2 on HLA-I antigen presentation. 

Our work uncovers previously uncharacterized SARS-CoV-2 
HLA-I peptides from out-of-frame ORFs in the SARS-CoV-2 
genome and highlights the contribution of these viral epitopes 
to the immune response in a mouse model and convalescent 
COVID-19 individuals. These new CD8+ T cell targets and the in- 
sights into HLA-I presentation in infected cells will enable more 
precise selection of peptides for COVID-19 immune monitoring 
and vaccine development. 


Limitations of the study 

The results of this study should be interpreted within the context 
of its technical limitations. First, immunopeptidome profiling was 
performed in infected cell lines and may not capture the in vivo 
conditions in a faithful manner. Nevertheless, T cell responses 
in affected individuals to HLA-I peptides, including non-canonical
epitopes, support the in vivo presentation of at least some
of the peptides reported in this study. Second, our study spans
nine HLA alleles endogenously expressed in two
cell lines. Further studies of SARS-CoV-2-infected cell lines
from diverse lineages and primary tissues expressing different
HLA alleles will likely facilitate identification of additional epitopes.
Third, LC-MS/MS-based assays can suffer from false negatives
when peptide abundance is below the limit of detection or
the sequence does not ionize well.


STAR★METHODS


Detailed methods are provided in the online version of this paper 
and include the following: 


● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
○ Lead contact
○ Materials availability
○ Data and code availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
○ Human Subjects
○ Cell culture
○ SARS-CoV-2 virus stock preparation
○ HLA-A02 transgenic mice
● METHOD DETAILS
○ Quantification of virus infectivity
○ Plaque assay
○ Immunoprecipitation of HLA-I complexes
○ HLA-I peptidome LC-MS/MS data generation
○ Whole proteome LC-MS/MS data generation
○ LC-MS/MS data interpretation
○ Validation of peptide identifications
○ RNA-Seq of SARS-CoV-2 infected cells
○ In-vitro MHC-peptide binding assay
○ Vaccinations of HLA-A2 transgenic mice
○ Mouse ELISpot assay
○ ELISpot assay with COVID-19 PBMCs
○ Multiplexed tetramer assay
○ Single-cell Sequencing
○ HLA-I antigen presentation prediction
○ HLA allele frequencies and coverage estimates
● QUANTIFICATION AND STATISTICAL ANALYSIS
○ Whole proteome analysis
○ Infected cells RNA-seq reads alignment
○ Scoring pMHC-TCR interactions
○ T cells transcriptomics analysis
○ Clustering and Annotation of single cell data


SUPPLEMENTAL INFORMATION 


Supplemental information can be found online at https://doi.org/10.1016/j.cell.2021.05.046.


CONSORTIA 


The members of the MGH COVID-19 Collection & Processing Team are Ken- 
dall Lavin-Parsons, Blair Parry, Brendan Lilley, Carl Lodenstein, Brenna 
McKaig, Nicole Charland, Hargun Khanna, Justin Margolin, Anna Gonye, Irena 
Gushterova, Tom Lasalle, Nihaarika Sharma, Brian C. Russo, Maricarmen Ro- 
jas-Lopez, Moshe Sade-Feldman, Kasidet Manakongtreecheep, Jessica Tan- 
tivit, and Molly Fisher Thomas. 


ACKNOWLEDGMENTS 


This project was started in April 2020 while SARS-CoV-2 was spreading fast in 
the United States and the people of Boston were sheltering in place. We are 
grateful to the tremendous efforts of the supporting teams, both at the Broad 
Institute and the NEIDL, who facilitated our research under those challenging 
circumstances. We thank J. Leon, O. Mizrahi, and E. Normandin for technical 
assistance; D.R. Mani and H. Metsky for computational assistance; and 
T. Ouspenskaia and P.M. Jean-Beltran for insightful discussions. We thank 
the Maus Lab volunteers who collected samples: E. Silva, K. Grauwet, and 
M. Jan. This study was supported in part by grants from the National Institute 
of Allergy and Infectious Diseases (U19AI110818 to P.C.S. and U19AI082630
to N.H.) and National Cancer Institute Clinical Proteomic Tumor Analysis Consortium
grants (NIH/NCI U24-CA210986 and NIH/NCI U01 CA214125 to
S.A.C. and NIH/NCI U24CA210979 to D.R.M.). S.W.-G. is the recipient of a Human
Frontier Science Program fellowship (LT-000396/2018), an EMBO non-stipendiary
long-term fellowship (ALTF 883-2017), the Gruss-Lipper postdoctoral
fellowship, the Zuckerman STEM Leadership Program fellowship, and the
Rothschild postdoctoral fellowship. S.K. is a Cancer Research Institute/Hearst 
Foundation fellow. C.H.T.T. was supported by a National Science Foundation 
graduate research fellowship (1745303). M.G. is the recipient of an EMBO 
long-term fellowship (ALTF 486-2018) and a Cancer Research Institute/Bris- 
tol-Myers Squibb fellow (CRI2993). N.B. is an extramural member of the Parker 
Institute for Cancer Immunotherapy. C.C.B. is a Parker Institute for Cancer
Immunotherapy Bridge Fellow. D.B.K. acknowledges funding support from
Emerson Collective and NIH/NCI R21 CA216772-01A1. C.M.R. is supported
by the G. Harold and Leila Y. Mathers Charitable Foundation, the Bawd Foundation,
and NIH NIAID 3 R01-AI091707-10S1. M.S. is supported by Boston
University startup funds. The MGH/MassCPR COVID biorepository was sup- 
ported by a gift from Ms. Enid Schwartz, by the Mark and Lisa Schwartz Foun- 
dation, the Massachusetts Consortium for Pathogen Readiness, and the Ra- 
gon Institute of MGH, MIT and Harvard. This project has been funded in part 
with federal funds from the Frederick National Laboratory for Cancer Research 
under contract No. HHSN261200800001E. The content of this publication 
does not necessarily reflect the views or policies of the Department of Health 
and Human Services, nor does mention of trade names, commercial products, 
or organizations imply endorsement by the US Government. This research
was supported in part by the Intramural Research Program of the NIH, Freder- 
ick National Laboratory, Center for Cancer Research. 


AUTHOR CONTRIBUTIONS 


S.W.-G., S.K., S.S., and J.G.A. conceptualized the study. S.W.-G., S.K., 
J.G.A., C.M.R., and M.S. designed the experiments. S.K., S.S., L.R.P., 
D.-Y.C., K.M.E.G., M.R.B., H.B.T., C.T., J.S., S.R., H.L.C., K.K., Y.W., 
D.L.-E., M.R.D., K.D.R., I.P.C., V.A.C., A.C., C.C.B., M.C., D.B.K., J.G.A., 
and M.S. performed experiments. S.W.-G., S.K., S.S., M.R.B., W.A.D., 
D.C.P., C.H.T.-T., Y.F., A.N., M.G., K.R.C., and J.G.A. performed data anal- 
ysis. N.B., D.H.B., A.S., M.V.M., D.B.K., D.C.P., N.H., S.A.C., M.S., and 
P.C.S. supervised experiments. S.W.-G., S.K., S.S., K.R.C., N.H., S.A.C., 
J.G.A., M.S., and P.C.S. wrote the manuscript with comments from all authors. 


DECLARATION OF INTERESTS 


S.W.-G., S.K., S.S., K.R.C., N.H., S.A.C., J.G.A., M.S., and P.C.S. are named
co-inventors on a patent application related to immunogenic compositions 
of this manuscript filed by The Broad Institute that is being made available in 
accordance with the COVID-19 technology licensing framework to maximize 
access to university innovations. D.L.-E., C.T., Y.W., M.R.D., W.A.D., and 
D.C.P. are employees and stockholders of Repertoire Immune Medicines. 
N.B. is an extramural member of the Parker Institute for Cancer Immuno- 
therapy; receives research funds from Regeneron, Harbor Biomedical, DC 
Prime, and Dragonfly Therapeutics; and is on the advisory boards of Neon 
Therapeutics, Novartis, Avidea, Boehringer Ingelheim, Rome Therapeutics, 
Roswell Park Comprehensive Cancer Center, BreakBio, Carisma Therapeu- 
tics, CureVac, Genotwin, BioNTech, Gilead Therapeutics, Tempest Therapeu- 
tics, and the Cancer Research Institute. A.S. is a consultant for Gritstone, Flow 
Pharma, CellCarta, Oxford Immunotech, Immunoscape, and Avalia. La Jolla
Institute for Immunology has filed for patent protection for various aspects of 
T cell epitope and vaccine design work. D.B.K. has previously advised Neon 
Therapeutics and has received consulting fees from Neon Therapeutics. 
D.B.K. owns equity in AduroBiotech, Agenus Inc., Armata Pharmaceuticals, 
Breakbio Corp., Biomarin Pharmaceutical Inc., Bristol Myers Squibb Com., 
Celldex Therapeutics Inc., Editas Medicine Inc., Exelixis Inc., Gilead Sciences 
Inc., IMV Inc., Lexicon Pharmaceuticals Inc., Moderna Inc., and Regeneron 
Pharmaceuticals. D.B.K. receives SARS-CoV-2 research support from Bei- 
Gene for a project unrelated to this publication. N.H. is a founder of Neon Ther- 
apeutics, Inc. (now BioNTech US), was a member of its scientific advisory 
board, and holds shares. N.H. is also an advisor for IFM Therapeutics. 
S.A.C. is a member of the scientific advisory boards of Kymera, PTM BioLabs, 
and Seer and a scientific advisor to Pfizer and Biogen. J.G.A. is a past 
employee and shareholder of Neon Therapeutics, Inc. (now BioNTech US). 


P.C.S. is a co-founder and shareholder of Sherlock Biosciences and a non-executive
board member and shareholder of Danaher Corporation.


INCLUSION AND DIVERSITY 


One or more of the authors of this paper self-identifies as an underrepresented 
ethnic minority in science. One or more of the authors of this paper self-identifies
as a member of the LGBTQ+ community. One or more of the authors of
this paper self-identifies as living with a disability. 


Received: October 12, 2020 
Revised: April 21, 2021 
Accepted: May 27, 2021 
Published: June 3, 2021 


REFERENCES 


Abelin, J.G., Keskin, D.B., Sarkizova, S., Hartigan, C.R., Zhang, W., Sidney, J., 
Stevens, J., Lane, W., Zhang, G.L., Eisenhaure, T.M., et al. (2017). Mass Spec- 
trometry Profiling of HLA-Associated Peptidomes in Mono-allelic Cells En- 
ables More Accurate Epitope Prediction. Immunity 46, 315-326. 


Altman, J.D., and Davis, M.M. (2016). MHC-Peptide Tetramers to Visualize An- 
tigen-Specific T Cells. Curr. Protoc. Immunol. 115, 17.3.1-17.3.44. 


Altmann, D.M., and Boyton, R.J. (2020). SARS-CoV-2 T cell immunity: Speci- 
ficity, function, durability, and role in protection. Sci. Immunol. 5, eabd6160. 


Aran, D., Looney, A.P., Liu, L., Wu, E., Fong, V., Hsu, A., Chak, S., Naikawadi, 
R.P., Wolters, P.J., Abate, A.R., et al. (2019). Reference-based analysis of lung 
single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Im- 
munol. 20, 163-172. 


Bassani-Sternberg, M., and Gfeller, D. (2016). Unsupervised HLA Peptidome 
Deconvolution Improves Ligand Prediction Accuracy and Predicts Coopera- 
tive Effects in Peptide-HLA Interactions. J. Immunol. 197, 2492-2499. 


Callaway, E. (2020). The race for coronavirus vaccines: a graphical guide. Na- 
ture 580, 576-577. 


Campbell, K.M., Steiner, G., Wells, D.K., Ribas, A., and Kalbasi, A. (2020). Prediction
of SARS-CoV-2 epitopes across 9360 HLA class I alleles. bioRxiv.
https://doi.org/10.1101/2020.03.30.016931. 


Chen, D.-Y., Khan, N., Close, B.J., Goel, R.K., Blum, B., Tavares, A.H., Kenney, 
D., Conway, H.L., Ewoldt, J.K., Kapell, S., et al. (2020a). SARS-CoV-2 desen- 
sitizes host cells to interferon through inhibition of the JAK-STAT pathway. bio- 
Rxiv. https://doi.org/10.1101/2020.10.27.358259. 


Chen, J., Brunner, A.-D., Cogan, J.Z., Nuñez, J.K., Fields, A.P., Adamson, B.,
Itzhak, D.N., Li, J.Y., Mann, M., Leonetti, M.D., and Weissman, J.S. (2020b). 
Pervasive functional translation of noncanonical human open reading frames. 
Science 367, 1140-1146. 


Chen, R.E., Zhang, X., Case, J.B., Winkler, E.S., Liu, Y., VanBlargan, L.A., Liu, 
J., Errico, J.M., Xie, X., Suryadevara, N., et al. (2021). Resistance of SARS- 
CoV-2 variants to neutralization by monoclonal and serum-derived polyclonal 
antibodies. Nat. Med. 27, 717-726. 


Cheng, Y., and Prusoff, W.H. (1973). Relationship between the inhibition constant
(KI) and the concentration of inhibitor which causes 50 per cent inhibition
(I50) of an enzymatic reaction. Biochem. Pharmacol. 22, 3099-3108.


Chong, C., Marino, F., Pak, H., Racle, J., Daniel, R.T., Müller, M., Gfeller, D., 
Coukos, G., and Bassani-Sternberg, M. (2018). High-throughput and Sensitive 
Immunopeptidomics Platform Reveals Profound Interferonγ-Mediated Remodeling
of the Human Leukocyte Antigen (HLA) Ligandome. Mol. Cell. Proteomics
17, 533-548.


Croft, N.P., Smith, S.A., Wong, Y.C., Tan, C.T., Dudek, N.L., Flesch, I.E.A., Lin, 
L.C.W., Tscharke, D.C., and Purcell, A.W. (2013). Kinetics of antigen expres- 
sion and epitope presentation during virus infection. PLoS Pathog. 9, 
e1003129. 


Dan, J.M., Mateus, J., Kato, Y., Hastie, K.M., Yu, E.D., Faliti, C.E., Grifoni, A., 
Ramirez, S.I., Haupt, S., Frazier, A., et al. (2020). Immunological memory to
SARS-CoV-2 assessed for up to eight months after infection. bioRxiv.
https://doi.org/10.1101/2020.11.15.383323.


Dawson, D.V., Ozgur, M., Sari, K., Ghanayem, M., and Kostyu, D.D. (2001). 
Ramifications of HLA class I polymorphism and population genetics for vaccine
development. Genet. Epidemiol. 20, 87-106.


Demmers, L.C., Heck, A.J.R., and Wu, W. (2019). Pre-fractionation Extends 
but also Creates a Bias in the Detectable HLA Class I Ligandome.
J. Proteome Res. 18, 1634-1643. 


Dominguez Andres, A., Feng, Y., Campos, A.R., Yin, J., Yang, C.-C., James, 
B., Murad, R., Kim, H., Deshpande, A.J., Gordon, D.E., et al. (2020). SARS- 
CoV-2 ORF9c Is a Membrane-Associated Protein that Suppresses Antiviral 
Responses in Cells. bioRxiv. https://doi.org/10.1101/2020.08.18.256776. 


Dutta, N.K., Mazumdar, K., and Gordy, J.T. (2020). The Nucleocapsid Protein 
of SARS-CoV-2: a Target for Vaccine Development. J. Virol. 94, e00647-20.


Erhard, F., Halenius, A., Zimmermann, C., L'Hernault, A., Kowalewski, D.J., 
Weekes, M.P., Stevanovic, S., Zimmer, R., and Dölken, L. (2018). Improved
Ribo-seq enables identification of cryptic translation events. Nat. Methods 
15, 363-366. 


Ferretti, A.P., Kula, T., Wang, Y., Nguyen, D.M.V., Weinheimer, A., Dunlap, 
G.S., Xu, Q., Nabilsi, N., Perullo, C.R., Cristofaro, A.W., et al. (2020). Unbiased 
Screens Show CD8+ T Cells of COVID-19 Patients Recognize Shared Epitopes
in SARS-CoV-2 that Largely Reside outside the Spike Protein. Immunity 53, 
1095-1107.e3. 


Finkel, Y., Schmiedel, D., Tai-Schmiedel, J., Nachshon, A., Winkler, R., Dobesova,
M., Schwartz, M., Mandelboim, O., and Stern-Ginossar, N. (2020a).
Comprehensive annotations of human herpesvirus 6A and 6B genomes reveal 
novel and conserved genomic features. eLife 9, e50960. 


Finkel, Y., Mizrahi, O., and Nachshon, A. (2020b). The coding capacity of 
SARS-CoV-2. Nature 589, 125-130. 


Finkel, Y., Gluck, A., Winkler, R., Nachshon, A., and Mizrahi, O. (2020c). SARS- 
CoV-2 utilizes a multipronged strategy to suppress host protein synthesis. Cell 
183, 1325-1339.e21. 


Francis, J.M., Leistritz-Edwards, D., Dunn, A., Tarr, C., Lehman, J., Dempsey, 
C., Hamel, A., Rayon, V., Liu, G., Wang, Y., et al. (2021). Allelic variation in Class
I HLA determines pre-existing memory responses to SARS-CoV-2 that shape
the CD8+ T cell repertoire upon viral exposure. bioRxiv.
https://doi.org/10.1101/2021.04.29.441258.


Gallagher, K.M.E., Leick, M.B., Larson, R.C., Berger, T.R., Katsis, K., Yam, 
J.Y., Brini, G., Grauwet, K., MGH COVID-19 Collection & Processing
Team, and Maus, M.V. (2021). SARS-CoV-2 T-cell immunity to variants of
concern following vaccination. bioRxiv. https://doi.org/10.1101/2021.05.03.442455.


Gordon, D.E., Jang, G.M., Bouhaddou, M., Xu, J., Obernier, K., White, K.M., 
O'Meara, M.J., Rezelj, V.V., Guo, J.Z., Swaney, D.L., et al. (2020). A SARS- 
CoV-2 protein interaction map reveals targets for drug repurposing. Nature 
583, 459-468. 


Grifoni, A., Weiskopf, D., Ramirez, S.I., Mateus, J., Dan, J.M., Moderbacher, 
C.R., Rawlings, S.A., Sutherland, A., Premkumar, L., Jadi, R.S., et al. 
(2020a). Targets of T Cell Responses to SARS-CoV-2 Coronavirus in Humans 
with COVID-19 Disease and Unexposed Individuals. Cell 181, 1489-1501.e15.


Grifoni, A., Sidney, J., Zhang, Y., Scheuermann, R.H., Peters, B., and Sette, A. 
(2020b). A Sequence Homology and Bioinformatic Approach Can Predict 
Candidate Targets for Immune Responses to SARS-CoV-2. Cell Host Microbe 
27, 671-680.e2. 

Gulukota, K., Sidney, J., Sette, A., and DeLisi, C. (1997). Two complementary 
methods for predicting peptides binding major histocompatibility complex 
molecules. J. Mol. Biol. 267, 1258-1267. 

Habel, J.R., Nguyen, T.H.O., van de Sandt, C.E., Juno, J.A., Chaurasia, P., 
Wragg, K., Koutsakos, M., Hensen, L., Jia, X., Chua, B., et al. (2020). Suboptimal
SARS-CoV-2-specific CD8+ T cell response associated with the prominent
HLA-A*02:01 phenotype. Proc. Natl. Acad. Sci. USA 117, 24384-24391.

Hansen, T.H., and Bouvier, M. (2009). MHC class I antigen presentation:
learning from viral evasion strategies. Nat. Rev. Immunol. 9, 503-513.




Hickman, H.D., Mays, J.W., Gibbs, J., Kosik, I., Magadan, J.G., Takeda, K.,
Das, S., Reynoso, G.V., Ngudiankama, B.F., Wei, J., et al. (2018). Influenza A
Virus Negative Strand RNA Is Translated for CD8+ T Cell Immunosurveillance.
J. Immunol. 201, 1222-1228. 


Hie, B., Bryson, B., and Berger, B. (2019). Efficient integration of heteroge- 
neous single-cell transcriptomes using Scanorama. Nat. Biotechnol. 37, 
685-691. 


Ingolia, N.T., Ghaemmaghami, S., Newman, J.R.S., and Weissman, J.S. 
(2009). Genome-wide analysis in vivo of translation with nucleotide resolution 
using ribosome profiling. Science 324, 218-223. 


Ingolia, N.T., Lareau, L.F., and Weissman, J.S. (2011). Ribosome profiling of 
mouse embryonic stem cells reveals the complexity and dynamics of mamma- 
lian proteomes. Cell 147, 789-802. 


Ingolia, N.T., Brar, G.A., Stern-Ginossar, N., Harris, M.S., Talhouarne, G.J.S., 
Jackson, S.E., Wills, M.R., and Weissman, J.S. (2014). Ribosome profiling re- 
veals pervasive translation outside of annotated protein-coding genes. Cell 
Rep. 8, 1365-1379. 


Jackson, L.A., Anderson, E.J., Rouphael, N.G., Roberts, P.C., Makhene, M., 
Coler, R.N., McCullough, M.P., Chappell, J.D., Denison, M.R., Stevens, L.J., 
et al. (2020). An mRNA vaccine against SARS-CoV-2 - preliminary report.
N. Engl. J. Med. 383, 1920-1931. 


Jungreis, I., Nelson, C.W., Ardern, Z., Finkel, Y., Krogan, N.J., Sato, K., Ziebuhr,
J., Stern-Ginossar, N., Pavesi, A., Firth, A.E., et al. (2021). Conflicting
and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: 
A homology-based resolution. Virology 558, 145-151. 


Kared, H., Redd, A.D., Bloch, E.M., Bonny, T.S., Sumatoh, H., Kairi, F., Car- 
bajo, D., Abel, B., Newell, E.W., Bettinotti, M.P., et al. (2021). SARS-CoV-2- 
specific CD8+ T cell responses in convalescent COVID-19 individuals. 
J. Clin. Invest. 131, 145476. 


Keskin, D.B., Reinhold, B.B., Zhang, G.L., Ivanov, A.R., Karger, B.L., and Rein- 
herz, E.L. (2015). Physical detection of influenza A epitopes identifies a stealth 
subset on human lung epithelium evading natural CD8 immunity. Proc. Natl. 
Acad. Sci. USA 112, 2151-2156. 


Ketteler, R. (2012). On programmed ribosomal frameshifting: the alternative 
proteomes. Front. Genet. 3, 242. 


Kim, D., Lee, J.-Y., Yang, J.-S., Kim, J.W., Kim, V.N., and Chang, H. (2020). The 
Architecture of SARS-CoV-2 Transcriptome. Cell 181, 914-921.e10.


Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. (2009). Ultrafast and 
memory-efficient alignment of short DNA sequences to the human genome. 
Genome Biol. 10, R25. 


Le Bert, N., Tan, A.T., Kunasegaran, K., Tham, C.Y.L., Hafezi, M., Chia, A., 
Chng, M.H.Y., Lin, M., Tan, N., Linster, M., et al. (2020). SARS-CoV-2-specific 
T cell immunity in cases of COVID-19 and SARS, and uninfected controls. Na- 
ture 584, 457-462. 


Ledford, H. (2021). How ‘killer’ T cells could boost COVID immunity in face of 
new variants. Nature 590, 374-375. 


Lu, R., Zhao, X., Li, J., Niu, P., Yang, B., Wu, H., Wang, W., Song, H., Huang, B., 
Zhu, N., et al. (2020). Genomic characterisation and epidemiology of 2019 
novel coronavirus: implications for virus origins and receptor binding. Lancet 
395, 565-574. 


Lunemann, S., Schöbel, A., Kah, J., Fittje, P., Hölzemer, A., Langeneckert,
A.E., Hess, L.U., Poch, T., Martrus, G., Garcia-Beltran, W.F., et al. (2018). Interactions
Between KIR3DS1 and HLA-F Activate Natural Killer Cells to Control
HCV Replication in Cell Culture. Gastroenterology 155, 1366-1371.e3. 


Maness, N.J., Walsh, A.D., Piaskowski, S.M., Furlott, J., Kolar, H.L., Bean, 
A.T., Wilson, N.A., and Watkins, D.I. (2010). CD8+ T cell recognition of cryptic
epitopes is a ubiquitous feature of AIDS virus infection. J. Virol. 84, 
11569-11574. 


McMurtrey, C.P., Lelic, A., Piazza, P., Chakrabarti, A.K., Yablonsky, E.J., Wahl, 
A., Bardet, W., Eckerd, A., Cook, R.L., Hess, R., et al. (2008). Epitope discovery 
in West Nile virus infection: Identification and immune recognition of viral epi- 
topes. Proc. Natl. Acad. Sci. USA 105, 2981-2986. 




Mulligan, M.J., Lyke, K.E., Kitchin, N., Absalon, J., Gurtman, A., Lockhart, S., 
Neuzil, K., Raabe, V., Bailey, R., Swanson, K.A., et al. (2020). Phase I/II study of 
COVID-19 RNA vaccine BNT162b1 in adults. Nature 586, 589-593. 


Neefjes, J., Jongsma, M.L.M., Paul, P., and Bakke, O. (2011). Towards a systems
understanding of MHC class I and MHC class II antigen presentation.
Nat. Rev. Immunol. 11, 823-836.


Nguyen, A., David, J.K., Maden, S.K., Wood, M.A., Weeder, B.R., Nellore, A., 
and Thompson, R.F. (2020). Human leukocyte antigen susceptibility map for 
SARS-CoV-2. J. Virol. 94, e00510-20.


Ouspenskaia, T., Law, T., Clauser, K.R., Klaeger, S., Sarkizova, S., Aguet, F., 
Li, B., Christian, E., Knisbacher, B.A., Le, P.M., et al. (2020). Thousands of 
novel unannotated proteins expand the MHC I immunopeptidome in cancer.
bioRxiv. https://doi.org/10.1101/2020.02.12.945840. 


Park, M.D. (2020). Immune evasion via SARS-CoV-2 ORF8 protein? Nat. Rev. 
Immunol. 20, 408. 


Poran, A., Harjanto, D., Malloy, M., Arieta, C.M., Rothenberg, D.A., Lenkala, D., 
van Buuren, M.M., Addona, T.A., Rooney, M.S., Srinivasan, L., and Gaynor, 
R.B. (2020). Sequence-based prediction of SARS-CoV-2 vaccine targets using 
a mass spectrometry-based bioinformatics predictor identifies immunogenic 
T cell epitopes. Genome Med. 12, 70.


Rambaut, A., Holmes, E.C., O’Toole, Á., Hill, V., McCrone, J.T., Ruis, C., du 
Plessis, L., and Pybus, O.G. (2020). A dynamic nomenclature proposal for 
SARS-CoV-2 lineages to assist genomic epidemiology. Nat. Microbiol. 5, 
1403-1407. 


Redd, A.D., Nardin, A., Kared, H., Bloch, E.M., Pekosz, A., Laeyendecker, O., 
Abel, B., Fehlings, M., Quinn, T.C., and Tobian, A.A. (2021). CD8+ T cell re- 
sponses in COVID-19 convalescent individuals target conserved epitopes 
from multiple prominent SARS-CoV-2 circulating variants. medRxiv. https:// 
doi.org/10.1101/2021.02.11.21251585. 


Rucevic, M., Kourjian, G., Boucau, J., Blatnik, R., Garcia Bertran, W., Berber- 
ich, M.J., Walker, B.D., Riemer, A.B., and Le Gall, S. (2016). Analysis of Major 
Histocompatibility Complex-Bound HIV Peptides Identified from Various Cell 
Types Reveals Common Nested Peptides and Novel T Cell Responses. 
J. Virol. 90, 8605-8620. 


Ruiz Cuevas, M.V., Hardy, M.-P., Holly, J., Bonneil, É., Durette, C., Courcelles, 
M., Lanoix, J., Côté, C., Staudt, L.M., Lemieux, S., et al. (2021). Most non-canonical
proteins uniquely populate the proteome or immunopeptidome. Cell
Rep. 34, 108815. 


Rydyznski Moderbacher, C., Ramirez, S.I., Dan, J.M., Grifoni, A., Hastie, K.M., 
Weiskopf, D., Belanger, S., Abbott, R.K., Kim, C., Choi, J., et al. (2020). Anti- 
gen-specific adaptive immunity to SARS-CoV-2 in acute COVID-19 and asso- 
ciations with age and disease severity. Cell 183, 996-1012.e19. 


Saini, S.K., Hersby, D.S., Tamhane, T., Povlsen, H.R., Amaya Hernandez, S.P., 
Nielsen, M., Gang, A.O., and Hadrup, S.R. (2020). SARS-CoV-2 genome-wide 
mapping of CD8 T cell recognition reveals strong immunodominance and sub- 
stantial CD8 T cell activation in COVID-19 patients. Sci. Immunol. 6, eabf7550.


Sarkizova, S., Klaeger, S., Le, P.M., Li, L.W., Oliveira, G., Keshishian, H., Har- 
tigan, C.R., Zhang, W., Braun, D.A., Ligon, K.L., et al. (2020). A large peptidome 
dataset improves HLA class I epitope prediction across most of the human
population. Nat. Biotechnol. 38, 199-209. 


Schellens, I.M., Meiring, H.D., Hoof, I., Spijkers, S.N., Poelen, M.C.M., van
Gaans-van den Brink, J.A.M., Costa, A.I., Vennema, H., Keşmir, C., van Baarle,
D., and van Els, C.A. (2015). Measles Virus Epitope Presentation by HLA: Novel 
Insights into Epitope Selection, Dominance, and Microvariation. Front. Immu- 
nol. 6, 546. 


Schmidt, N., Lareau, C.A., Keshishian, H., Ganskih, S., Schneider, C., Hennig, 
T., Melanson, R., Werner, S., Wei, Y., Zimmer, M., et al. (2020). The SARS- 
CoV-2 RNA-protein interactome in infected human cells. Nat. Microbiol. 6,
339-353. 


Schubert, K., Karousis, E.D., Jomaa, A., Scaiola, A., Echeverria, B., Gurzeler, 
L.-A., Leibundgut, M., Thiel, V., Mühlemann, O., and Ban, N. (2020). SARS- 
CoV-2 Nsp1 binds the ribosomal mRNA channel to inhibit translation. Nat. 
Struct. Mol. Biol. 27, 959-966. 




Schwanhäusser, B., Busse, D., Li, N., Dittmar, G., Schuchhardt, J., Wolf, J.,
Chen, W., and Selbach, M. (2011). Global quantification of mammalian gene 
expression control. Nature 473, 337-342. 


Sekine, T., Perez-Potti, A., Rivera-Ballesteros, O., Strålin, K., Gorin, J.-B., Olsson,
A., Llewellyn-Lacey, S., Kamal, H., Bogdanovic, G., Muschiol, S., et al.
(2020). Robust T cell immunity in convalescent individuals with asymptomatic
or mild COVID-19. Cell 183, 158-168.e14.


Shomuradova, A.S., Vagida, M.S., Sheetikov, S.A., Zornikova, K.V., Kiryukhin, 
D., Titov, A., Peshkova, I.O., Khmelevskaya, A., Dianov, D.V., Malasheva, M.,
et al. (2020). SARS-CoV-2 Epitopes Are Recognized by a Public and Diverse 
Repertoire of Human T Cell Receptors. Immunity 53, 1245-1257.e5. 


Sidney, J., Southwood, S., Moore, C., Oseroff, C., Pinilla, C., Grey, H.M., and 
Sette, A. (2013). Measurement of MHC/peptide interactions by gel filtration or 
monoclonal antibody capture. Curr. Protoc. Immunol., Chapter 18, Unit 18.3. 


Solberg, O.D., Mack, S.J., Lancaster, A.K., Single, R.M., Tsai, Y., Sanchez- 
Mazas, A., and Thomson, G. (2008). Balancing selection and heterogeneity 
across the classical human leukocyte antigen loci: a meta-analytic review of 
497 population studies. Hum. Immunol. 69, 443-464. 


Sonenberg, N., and Hinnebusch, A.G. (2009). Regulation of translation initiation
in eukaryotes: mechanisms and biological targets. Cell 136, 731-745.


Starck, S.R., and Shastri, N. (2016). Nowhere to hide: unconventional transla- 
tion yields cryptic peptides for immune surveillance. Immunol. Rev. 272, 8-16. 


Stern-Ginossar, N., Weisburd, B., Michalski, A., Le, V.T.K., Hein, M.Y., Huang, 
S.-X., Ma, M., Shen, B., Qian, S.-B., Hengel, H., et al. (2012). Decoding human 
cytomegalovirus. Science 338, 1088-1093. 


Stukalov, A., Girault, V., Grass, V., Bergant, V., Karayel, O., Urban, C., Haas, 
D.A., Huang, Y., Oubraham, L., Wang, A., et al. (2020). Multi-level proteomics 
reveals host-perturbation strategies of SARS-CoV-2 and SARS-CoV. bioRxiv. 
https://doi.org/10.1101/2020.06.17.156455. 


Su, Y., Chen, D., Yuan, D., Lausted, C., Choi, J., Dai, C.L., Voillet, V., Duvvuri, 
V.R., Scherler, K., Troisch, P., et al.; ISB-Swedish COVID19 Biobanking Unit 
(2020). Multi-Omics Resolves a Sharp Disease-State Shift between Mild and 
Moderate COVID-19. Cell 183, 1479-1495.e20. 


Takagi, A., and Matsui, M. (2020). Identification of HLA-A*02:01-restricted 
candidate epitopes derived from the non-structural polyprotein 1a of SARS- 
CoV-2 that may be natural targets of CD8+ T cell recognition in vivo. J. Virol.
95, e01837-20.


Tarke, A., Sidney, J., Kidd, C.K., Dan, J.M., Ramirez, S.I., Yu, E.D., Mateus, J., 
da Silva Antunes, R., Moore, E., Rubiro, P., et al. (2020). Comprehensive 
analysis of T cell immunodominance and immunoprevalence of SARS- 
CoV-2 epitopes in COVID-19 cases. bioRxiv. https://doi.org/10.1101/2020. 
12.08.416750. 


Tarke, A., Sidney, J., Methot, N., Zhang, Y., Dan, J.M., Goodwin, B., Rubiro, P., 
Sutherland, A., da Silva Antunes, R., Frazier, A., et al. (2021). Negligible impact 
of SARS-CoV-2 variants on CD4+ and CD8+ T cell reactivity in COVID-19 
exposed donors and vaccinees. bioRxiv. https://doi.org/10.1101/2021.02. 
27.433180. 


Ternette, N., Yang, H., Partridge, T., Llano, A., Cedeno, S., Fischer, R., Charles, 
P.D., Dudek, N.L., Mothe, B., Crespo, M., et al. (2016). Defining the HLA class 
I-associated viral antigen repertoire from HIV-1-infected human cells. Eur. J.
Immunol. 46, 60-69. 

Thompson, A., Schäfer, J., Kuhn, K., Kienle, S., Schwarz, J., Schmidt, G., Neumann,
T., Johnstone, R., Mohammed, A.K.A., and Hamon, C. (2003). Tandem
mass tags: a novel quantification strategy for comparative analysis of complex 
protein mixtures by MS/MS. Anal. Chem. 75, 1895-1904. 

Tyanova, S., Temu, T., Sinitcyn, P., Carlson, A., Hein, M.Y., Geiger, T., Mann, 
M., and Cox, J. (2016). The Perseus computational platform for comprehen- 
sive analysis of (prote)omics data. Nat. Methods 13, 731-740. 

van Dijk, D., Sharma, R., Nainys, J., Yim, K., Kathail, P., Carr, A.J., Burdziak, C.,
Moon, K.R., Chaffer, C.L., Pattabiraman, D., et al. (2018). Recovering Gene Interactions
from Single-Cell Data Using Data Diffusion. Cell 174, 716-729.e27.

Weiskopf, D., Schmitz, K.S., Raadsen, M.P., Grifoni, A., Okba, N.M.A., Endeman,
H., van den Akker, J.P.C., Molenkamp, R., Koopmans, M.P.G., van Gorp,
E.C.M., et al. (2020). Phenotype and kinetics of SARS-CoV-2-specific T cells in
COVID-19 patients with acute respiratory distress syndrome. Sci. Immunol. 5,
eabd2071.

Wolf, F.A., Angerer, P., and Theis, F.J. (2018). SCANPY: large-scale single-cell 
gene expression data analysis. Genome Biol. 19, 15.

Wu, T., Guan, J., Handel, A., Tscharke, D.C., Sidney, J., Sette, A., Wakim, L.M., 
Sng, X.Y.X., Thomas, P.G., Croft, N.P., et al. (2019). Quantification of epitope 
abundance reveals the effect of direct and cross-presentation on influenza 
CTL responses. Nat. Commun. 10, 2846. 

Wu, F., Zhao, S., Yu, B., Chen, Y.-M., Wang, W., Song, Z.-G., Hu, Y., Tao, Z.-W.,
Tian, J.-H., Pei, Y.-Y., et al. (2020). A new coronavirus associated with human
respiratory disease in China. Nature 579, 265-269.

Wu, K., Werner, A.P., Koch, M., Choi, A., Narayanan, E., Stewart-Jones,
G.B.E., Colpitts, T., Bennett, H., Boyoglu-Barnum, S., Shi, W., et al. (2021).
Serum Neutralizing Activity Elicited by mRNA-1273 Vaccine - Preliminary
Report. N. Engl. J. Med. 384, 1468-1470.


Yang, N., Gibbs, J.S., Hickman, H.D., Reynoso, G.V., Ghosh, A.K., Bennink, 
J.R., and Yewdell, J.W. (2016). Defining Viral Defective Ribosomal Products: 
Standard and Alternative Translation Initiation Events Generate a Common 
Peptide from Influenza A Virus M2 and M1 mRNAs. J. Immunol. 196, 
3608-3617. 

Zhang, Y., Zhang, J., Chen, Y., Luo, B., Yuan, Y., Huang, F., Yang, T., Yu, F.,
Liu, J., Liu, B., et al. (2020). The ORF8 Protein of SARS-CoV-2 Mediates Immune
Evasion through Potently Downregulating MHC-I. bioRxiv.
https://doi.org/10.1101/2020.05.24.111823.

Zhu, M.-S., Pan, Y., Chen, H.-Q., Shen, Y., Wang, X.-C., Sun, Y.-J., and Tao, 
K.-H. (2004). Induction of SARS-nucleoprotein-specific immune response by 
use of DNA vaccine. Immunol. Lett. 92, 237-243. 




STAR★METHODS


KEY RESOURCES TABLE 


REAGENT or RESOURCE | SOURCE | IDENTIFIER

Antibodies
HLA Class I antibody W6/32 for HLA-IP | Santa Cruz Biotechnology | Cat # sc-32235; RRID:AB_627934
HLA Class I antibody W6/32 for in-vitro binding assay | ATCC | Cat # HB-95; RRID:CVCL_7872
SARS-CoV Nucleocapsid (N) Protein (RABBIT) polyclonal antibody | Rockland | Cat # 200-401-A50; RRID:AB_828403
Alexa Fluor 568 goat anti rabbit antibody | Invitrogen | Cat # A11011; RRID:AB_143187
anti-mouse IFNγ mAb AN18, purified 250 µg (1 mg/mL) | Mabtech | Cat # 3321-3-250; RRID:AB_907279
anti-mouse IFNγ mAb R4-6A2, biotinylated 250 µg (1 mg/mL) | Mabtech | Cat # 3321-6-250; RRID:AB_907271
MS CD3E pure mAb 0.5 mg/mL | BD Biosciences | Cat # 553058; RRID:AB_394591

Bacterial and virus strains
SARS-CoV-2 | Centers for Disease Control and Prevention and BEI Resources | 2019-nCoV/USA-WA1/2020 isolate (NCBI accession number: MN985325.1)

Biological samples
Convalescent donor blood sample | Precision for Medicine (USA) | https://www.precisionformedicine.com/
Convalescent donor blood sample | Sanguine (USA) | https://sanguinebio.com/
Convalescent donor blood sample | CTL (USA) | http://www.immunospot.com/ImmunoSpot-ePBMC
Convalescent donor blood sample | The Immune Monitoring Laboratory, MGH | https://www.massgeneral.org/cancer-center/clinical-trials-and-research/immunotherapy/cellular-immunotherapy-program/immune-monitoring-laboratory

Chemicals, peptides, and recombinant proteins
1M Tris, pH 8.0 | Invitrogen | Cat # AM9855G
EDTA | Sigma Aldrich | Cat # 7789
Sodium chloride | Sigma Aldrich | Cat # 71376
Triton-X | Sigma Aldrich | Cat # 19284
Octyl β-d-glucopyranoside | Sigma Aldrich | Cat # 08001
Phenylmethanesulfonyl fluoride | Sigma Aldrich | Cat # 78830
cOmplete Protease Inhibitor Tablet-EDTA free | Sigma Aldrich | Cat # 5056489001
Aprotinin | Sigma Aldrich | Cat # A6103
Leupeptin | Roche | Cat # 11017101001
Sodium fluoride | Sigma Aldrich | Cat # 57920
Phosphatase inhibitor cocktail 2 | Sigma Aldrich | Cat # P5726
Phosphatase inhibitor cocktail 3 | Sigma Aldrich | Cat # P0044
DNase I | Sigma Aldrich | Cat # 4716728001
Gammabind Plus Sepharose beads | Millipore Sigma | Cat # 17-0886-01
Dithiothreitol, No-Weigh Format | Fisher Scientific | Cat # 20291
Iodoacetamide | Sigma Aldrich | Cat # A3221
Lysyl endopeptidase (LysC) | Wako Chemicals | Cat # 129-02541
Trypsin, Mass Spec Grade | Promega | Cat # V511X
Formic Acid | Sigma Aldrich | Cat # F0507
Acetonitrile | Honeywell | Cat # 34967
Trifluoroacetic acid | Sigma Aldrich | Cat # 302031
TMT sixplex Isobaric Label Reagent Set | ThermoFisher | Cat # 90061
0.5M HEPES, pH 8.5 | Alfa Aesar | Cat # J63218
Hydroxylamine solution, 50% (vol/vol) in H2O | Sigma Aldrich | Cat # 467804
Methanol | Honeywell | Cat # 34966
Ammonium hydroxide solution, 28% (wt/vol) in H2O | Sigma Aldrich | Cat # 338818
Acetic acid, glacial | Sigma Aldrich | Cat # AX0073
Avicel | DuPont | Cat # RC-581
Benzonase | Thomas Scientific | Cat # E1014-25KU
Synthetic peptides, 5 mg, > 90% purity | Genscript | Customized quote
SARS-CoV-2 Nucleocapsid peptides pool | JPT | PM-WCPV-NCAP
Human IFNγ ELISpot | CTL | Cat # hIFNgp-2M/2
Illumina TruSeq Stranded mRNA (LT) | Illumina | Cat # FC-122-2101
Agilent 2200 TapeStation D1000 ScreenTape | Agilent | Cat # 5067-5582
NextSeq V2.5 High Output 75 cycle kit | Illumina | Cat # 20024906
NextSeq V2.5 High Output 150 cycles kit | Illumina | Cat # 20024907
PolyIC/LC, Hiltonol (2 mg/mL) | Oncovir |
Streptavidin HRP | Mabtech | Cat # 3310-9-1000
Substrate for ELISpot: TMB | Mabtech | Cat # 3651-10
M. Tuberculosis H37 RA | DIFCO Laboratories | Cat # 231141
Adjuvant complete Freund | Becton, Dickinson and Co | Cat # 263810
GIBCO Phytohemagglutinin, M form (PHA-M) | GIBCO | Cat # 10576015
Multiscreen IP WHT STRL | Thermo Fisher Scientific | Cat # MSIPS4W10
SDB-XC disk, 47 mm | Empore 3M | Cat # 2240

Critical commercial assays
HLA-A,B,C typing | Histogenetics | Customized quote

Deposited data
RNA sequencing data | GEO | GSE159191
Original mass spectra, peptide spectrum matches and databases | MassIVE | MSV000087225

Experimental models: Cell lines
A549 | ATCC | CCL-185
HEK293T | ATCC | CRL-3216
VERO E6 | ATCC | CRL-1586

Experimental models: Organisms/strains
B6.Cg-Immp2l^Tg(HLA-A/H2-D)2Enge/J | The Jackson Laboratory | Stock # 004191

Software and algorithms
Spectrum Mill software package v7.1 pre-Release | Broad Institute | https://proteomics.broadinstitute.org/
Bowtie 1.1.2 | (Langmead et al., 2009) | http://bowtie-bio.sourceforge.net/index.shtml
HLAthena binding prediction | Broad Institute | http://hlathena.tools/

Other
Easy-nLC 1200 System | Thermo Fisher Scientific | LC140
Orbitrap Exploris 480 | Thermo Fisher Scientific | BRE725532
FAIMS Pro Interface | Thermo Fisher Scientific | FNS02-10001
Sequences, analyses, and barcoded tetramer pools used to determine peripheral T cell specificity | This paper | N/A




RESOURCE AVAILABILITY 


Lead contact 
Further information and requests for resources and reagents should be directed to the lead contact, Shira Weingarten-Gabbay 
(shirawg@broadinstitute.org). 


Materials availability 
Cell lines transduced with ACE2 and TMPRSS2 are available upon request. 


Data and code availability 

The raw RNA sequencing data generated in this study have been submitted to the Gene Expression Omnibus (GEO; https://www. 
ncbi.nlm.nih.gov/geo/) under accession number GSE159191. The original mass spectra, peptide spectrum matches, and the protein 
sequence databases used for searches have been deposited in the public proteomics repository MassIVE (https://massive.ucsd. 
edu) and are accessible at ftp://MSV000087225@massive.ucsd.edu. 


EXPERIMENTAL MODEL AND SUBJECT DETAILS 


Human Subjects 

Peripheral blood samples for pooled ELISpot assays were collected from COVID-19 convalescent patients (2 male and 4 female)
under informed consent. Ethical review for the collection of peripheral blood samples and the secondary use of the PBMCs was
conducted by Partners Healthcare Services (PHS) Institutional Review Board (IRB protocol IDs: 2020P000804 and 2020P001446).
Peripheral blood samples for multiplexed tetramer assays (9 male and 8 female) were purchased from Precision for Medicine (USA),
Sanguine (USA), or CTL (USA). These companies ethically collected samples under informed consent or as clinical excess specimens 
under a waiver of consent. 


Cell culture 

Human embryonic kidney HEK293T cells (female), human lung A549 cells (male), and African green monkey kidney Vero E6 cells (fe- 
male) were maintained at 37°C and 5% CO2 in DMEM containing 10% FBS. We generated stable HEK293T and A549 cells express- 
ing human ACE2 and TMPRSS2 by transducing them with lentivirus particles carrying these two cDNAs. A549 cells express A*25:01/ 
30:01, B*18:01/44:03 and C*12:03/16:01, while HEK293T cells express A*02:01, B*07:02 and C*07:02 (determined by HLA typing, 
Histogenetics, USA). 


SARS-CoV-2 virus stock preparation 

The 2019-nCoV/USA-WA1/2020 isolate (NCBI accession number: MN985325) of SARS-CoV-2 was obtained from the Centers for 
Disease Control and Prevention and BEI Resources. To generate the virus P1 stock, we infected Vero E6 cells with this isolate for 
1h at 37°C, removed the virus inoculum, rinsed the cell monolayer with 1X PBS, and added DMEM supplemented with 2% FBS. Three 
days later, when the cytopathic effect of the virus became visible, we harvested the culture medium, passed it through a 0.2 µm filter, and
stored it at −80°C. To generate the virus P2 stock, we infected Vero E6 cells with the P1 stock at a multiplicity of infection (MOI) of 0.1
plaque forming units (PFU)/cell and harvested the culture medium three days later using the same protocol as for the P1 stock. All 
experiments in this study were performed using the P2 stock. 
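
The MOI quoted above fixes how much virus is needed once the cell number and the stock titer are known. The following is a minimal sketch of that arithmetic only; the function names, the cell count, and the titer in the usage lines are hypothetical and are not values reported in this study.

```python
def pfu_required(n_cells: float, moi: float) -> float:
    """Plaque forming units needed to infect n_cells at a given MOI (PFU/cell)."""
    if n_cells < 0 or moi < 0:
        raise ValueError("n_cells and moi must be non-negative")
    return n_cells * moi


def inoculum_volume_ml(n_cells: float, moi: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (mL) that delivers the required PFU."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return pfu_required(n_cells, moi) / titer_pfu_per_ml


# Hypothetical example: 5e6 cells, MOI 0.1, stock titer 1e6 PFU/mL (assumed values).
volume = inoculum_volume_ml(5e6, 0.1, 1e6)
print(f"{volume:.2f} mL of stock")
```

The same helper applies to the MOI of 3 used for the infection experiments; only the inputs change.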


HLA-A02 transgenic mice

6-8 week old, female HLA-A2 transgenic AAD mice were used in the experiments (B6.Cg-Immp2l^Tg(HLA-A/H2-D)2Enge/J, The Jackson
Laboratory). These animals express a chimeric molecule, which contains the peptide-binding α1 and α2 domains of the HLA-A2.1
molecule and the α3 domain of the mouse H-2Dᵈ molecule, under the control of the HLA-A2.1 promoter in addition to mouse MHC H-2ᵇ.
These animals allow the testing of human T cell immune responses to HLA-A2 presented antigens. Animals were maintained and bred 
in the animal facility at Dana Farber Cancer Institute in compliance with the Institutional Animal Care and Use Committee. 


METHOD DETAILS 


Quantification of virus infectivity 

A549 and 293T cells expressing ACE2 and TMPRSS2 were infected with SARS-CoV-2 (Washington isolate) at an MOI of 3 for the indicated
times (3, 6, 12, 18, and 24 hpi). After infection, supernatants were removed, and cells were fixed with 4% paraformaldehyde for
30 minutes at room temperature. Cells were then permeabilized with 0.1% Triton X-100 in PBS for 10 minutes and incubated with
Anti-SARS-CoV Nucleocapsid (N) Protein (RABBIT) polyclonal antibody (1:2000, Rockland, #200-401-A50) at 4°C overnight. Alexa
Fluor 568 goat anti rabbit antibody (Invitrogen, A11011) was used as the secondary antibody for labeling virus infected cells. Finally,
DAPI was added to label the nuclei. Immunofluorescent images were taken using an EVOS microscope with 10x lens and infection 
rates were calculated with ImageJ. 




Plaque assay 

Vero E6 cells were used to determine the titer of our virus stock and to evaluate SARS-CoV-2 inactivation following lysis of infected 
cells in our HLA-IP buffer. Briefly, we seeded Vero E6 cells into a 12-well plate at a density of 2.5 × 10^5 cells per well, and the next day,
infected them with serial 10-fold dilutions of the virus stock (for titration) or the A549 lysates (for the inactivation assay) for 1h at 37°C.
We then added 1 mL per well of the overlay medium containing 2X DMEM (GIBCO: #12800017) supplemented with 4% FBS and
mixed at a 1:1 ratio with 1.2% Avicel (DuPont; RC-581) to obtain the final concentrations of 2% and 0.6% for FBS and Avicel, respectively.
Three days later, we removed the overlay medium, rinsed the cell monolayer with 1X PBS and fixed the cells with 4%
paraformaldehyde for 30 minutes at room temperature. 0.1% crystal violet was used to visualize the plaques.
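
The overlay concentrations follow from the 1:1 mixing step, and a plaque count converts to a titer by the usual dilution arithmetic. A minimal sketch is given here under stated assumptions: the function names are hypothetical, and the plaque count, dilution, and inoculum volume in the usage lines are invented for illustration because the text does not report them.

```python
def final_percent(stock_percent: float, stock_parts: float, other_parts: float) -> float:
    """Concentration after mixing stock_parts of a stock with other_parts of diluent."""
    return stock_percent * stock_parts / (stock_parts + other_parts)


def titer_pfu_per_ml(plaque_count: int, dilution: float, inoculum_ml: float) -> float:
    """Standard plaque-assay titer: plaques / (dilution factor x inoculum volume)."""
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return plaque_count / (dilution * inoculum_ml)


# 2X DMEM with 4% FBS mixed 1:1 with 1.2% Avicel, as described in the text.
fbs_final = final_percent(4.0, 1, 1)      # percent FBS in the overlay
avicel_final = final_percent(1.2, 1, 1)   # percent Avicel in the overlay

# Hypothetical read-out: 30 plaques at the 1e-4 dilution with 0.2 mL inoculum.
example_titer = titer_pfu_per_ml(30, 1e-4, 0.2)
print(fbs_final, avicel_final, f"{example_titer:.2e} PFU/mL")
```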


Immunoprecipitation of HLA-I complexes 

Cells engineered to express SARS-CoV-2 entry factors were seeded into nine 15 cm dishes (three dishes per time point) at a density of
15 million cells per dish for A549 cells and 20 million cells per dish for HEK293T cells. The next day, the cells were infected with
SARS-CoV-2 at a multiplicity of infection (MOI) of 3. To synchronize infection, the virus was bound to target cells in a small volume of Opti-MEM
on ice for one hour, followed by addition of DMEM/2% FBS and switching to 37°C. At 3, 6, 12, 18, and 24h post-infection, the cells from
three dishes were scraped into 2.5 mL/dish of cold lysis buffer (20mM Tris, pH 8.0, 100mM NaCl, 6mM MgCl2, 1mM EDTA, 60mM Octyl
β-d-glucopyranoside, 0.2mM Iodoacetamide, 1.5% Triton X-100, 50× cOmplete Protease Inhibitor Tablet-EDTA free and PMSF),
obtaining a total of 9 mL lysate. This lysate was split into 6 eppendorf tubes, with each tube receiving 1.5 mL volume, and incubated on ice
for 15 min with 1 µL of Benzonase (Thomas Scientific, E1014-25KU) to degrade nucleic acid. The lysates were then centrifuged at
4,000 rpm for 22 min at 4°C and the supernatants were transferred to another set of six eppendorf tubes containing a mixture of
pre-washed beads (Millipore Sigma, GE17-0886-01) and 50 µL of an MHC class I antibody (W6/32) (Santa Cruz Biotechnology,
sc-32235). The immune complexes were captured on the beads by incubating on a rotor at 4°C for 3hr in the BSL3 lab. Virus inactivation
was confirmed by plaque assay before subsequent sample processing outside the BSL3 (Figure S1C). The unbound lysates were
kept for whole proteomics analysis while the beads were washed to remove non-specifically bound material. In total, nine washing
steps were performed: one wash with 1 mL of cold lysis wash buffer (20mM Tris, pH 8.0, 100mM NaCl, 6mM MgCl2, 1mM EDTA,
60mM Octyl β-d-glucopyranoside, 0.2mM Iodoacetamide, 1.5% Triton X-100), four washes with 1 mL of cold complete wash buffer
(20mM Tris, pH 8.0, 100mM NaCl, 1mM EDTA, 60mM Octyl β-d-glucopyranoside, 0.2mM Iodoacetamide), and four washes with
20mM Tris pH 8.0 buffer. Dry beads were stored at −80°C until mass-spectrometry analysis was performed.


HLA-I peptidome LC-MS/MS data generation 

HLA peptides were eluted and desalted from beads as described previously (Sarkizova et al., 2020). After the primary elution step, 
HLA peptides were reconstituted in 3% ACN/5% FA and subjected to microscaled basic reverse phase separation. Briefly, peptides
were loaded on Stage-tips with 2 punches of SDB-XC material (Empore 3M) and eluted in three fractions with increasing concentrations
of ACN (5%, 10% and 30% in 0.1% NH4OH, pH 10). For the time course experiment, one third of a pool of 6 IPs (for 12|18|24h) or
a pool of 2 IPs (for 3|6|24h) was also labeled with TMT6 (Thermo Fisher Scientific, lot # UC280588; A549: 12h:126, 3h:127, 18h:128,
6h:129, 24h:130; HEK293T: 3h:126, 12h:127, 6h:128, 18h:129, 24h:131) (Thompson et al., 2003), combined and desalted on a C18
Stage-tip, and then eluted into three fractions using basic reversed phase fractionation with increasing concentrations of ACN (10%,
15% and 50%) in 5 mM ammonium formate (pH 10). Peptides were reconstituted in 3% ACN/5% FA prior to loading onto an analytical
column (25-30 cm, 1.9 µm C18 (Dr. Maisch HPLC GmbH), packed in-house PicoFrit 75 µm inner diameter, 10 µm emitter (New Objective)).
Peptides were eluted with a linear gradient (EasyNanoLC 1200, Thermo Fisher Scientific) ranging from 6%-30% Solvent B
(0.1% FA in 90% ACN) over 84 min, 30%-90% B over 9 min and held at 90% B for 5 min at 200 nl/min. MS/MS were acquired on
a Thermo Orbitrap Exploris 480 equipped with FAIMS (Thermo Fisher Scientific) in data dependent acquisition. FAIMS CVs were
set to −50 and −70 with a cycle time of 1.5 s per FAIMS experiment. MS2 fill time was set to 100 ms, and collision energy was 29CE,
or 32CE for TMT-labeled samples.
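
The elution gradient described above is piecewise linear, so the Solvent B percentage at any time point can be interpolated from its breakpoints. A minimal sketch follows; it assumes that t = 0 marks the start of the 6% B ramp (loading and equilibration steps are not described in the text), and the names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % Solvent B) breakpoints taken from the text:
# 6% -> 30% over 84 min, 30% -> 90% over 9 min, hold at 90% for 5 min.
GRADIENT = [(0.0, 6.0), (84.0, 30.0), (93.0, 90.0), (98.0, 90.0)]
FLOW_NL_PER_MIN = 200.0


def percent_b(t_min: float) -> float:
    """Linearly interpolate % Solvent B at time t_min within the gradient."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time outside the described gradient")
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:          # exactly at the final breakpoint
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Total solvent delivered over the gradient, in microliters.
total_volume_ul = GRADIENT[-1][0] * FLOW_NL_PER_MIN / 1000.0
print(percent_b(42.0), percent_b(88.5), total_volume_ul)
```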


Whole proteome LC-MS/MS data generation 

A 200 µL aliquot of each HLA IP supernatant was reduced for 30 minutes with 5 mM DTT (Pierce DTT: A39255) and alkylated with 10 mM IAA
(Sigma IAA: A3221-10VL) for 45 minutes, both at 25°C on a shaker (1000 rpm). Protein precipitation using methanol/chloroform was
then performed. Briefly, methanol was added at a volume of 4x that of the HLA IP supernatant aliquot. This was followed by a 1x volume of
chloroform and 3x volume of water. The sample was mixed by vortexing and incubated at −20°C for 1.5 hours. Samples were then
centrifuged at 14,000 rpm for 10 minutes and the upper liquid layer was removed, leaving a protein pellet. The pellet was rinsed with 3x
volume of methanol, vortexed lightly, and centrifuged at 14,000 rpm for 10 minutes. Supernatant was removed and discarded without
disturbing the pellet. Pellets were resuspended in 100 mM triethylammonium bicarbonate (pH 8.5) (TEAB). Samples were digested
with LysC (1:50) for 2h on a shaker (1000 rpm) at 25°C, followed by trypsin (1:50) overnight. Samples were acidified to a final concentration
of 1% formic acid and dried. Samples were reconstituted in 4.5 mM ammonium formate (pH 10) in 2% (vol/vol) acetonitrile and separated
into four fractions using basic reversed phase fractionation on a C-18 Stage-tip. Fractions were eluted at 5%, 12.5%, 15%, and
50% ACN/4.5 mM ammonium formate (pH 10) and dried. Fractions were reconstituted in 3% ACN/5% FA, and 1 µg was used for
LC-MS/MS analysis. MS/MS were acquired on a Thermo Orbitrap Exploris 480 (Thermo Fisher Scientific) in data dependent acquisition
(MS2 isolation width 0.7 m/z, top20 scans, collision energy 30%) (Figures 2, 3, 4, and S3B-S3D). Uninfected 1 µg single shot samples
were analyzed similarly. For the time course experiment, the samples (12h, 18h, 24h) were not fractionated and 1 µg was used for
LC-MS/MS analysis, as described above except that FAIMS with −50, −65, and −85 CV was applied and cycle time was 0.8 s for each
CV (Figure S3A).
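
The methanol/chloroform precipitation is specified as multiples of the sample volume, which can be turned into pipetting volumes directly. The sketch below is illustrative only; the function name is hypothetical, and it assumes that the 3x methanol rinse is also measured relative to the original aliquot, which the text does not state explicitly.

```python
def precipitation_volumes_ul(sample_ul: float) -> dict:
    """Reagent volumes (µL) for methanol/chloroform precipitation of one aliquot."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return {
        "methanol": 4 * sample_ul,        # 4x sample volume
        "chloroform": 1 * sample_ul,      # 1x sample volume
        "water": 3 * sample_ul,           # 3x sample volume
        "methanol_rinse": 3 * sample_ul,  # assumption: relative to the aliquot
    }


# For the 200 µL aliquot used in the text.
for reagent, volume in precipitation_volumes_ul(200).items():
    print(f"{reagent}: {volume} µL")
```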


LC-MS/MS data interpretation 

Peptide sequences were interpreted from MS/MS spectra using Spectrum Mill (v 7.1 pre-release) to search against a RefSeq-based
sequence database containing 41,457 proteins mapped to the human reference genome (hg38) obtained via the UCSC Table
Browser (https://genome.ucsc.edu/cgi-bin/hgTables) on June 29, 2018, with the addition of 13 proteins encoded in the human
mitochondrial genome, 264 common laboratory contaminant proteins, 553 human non-canonical small open reading frames, 28
SARS-CoV-2 proteins obtained from RefSeq derived from the original Wuhan-Hu-1 China isolate NC_045512.2
(https://www.ncbi.nlm.nih.gov/nuccore/1798174254; Wu et al., 2020), and 23 novel unannotated virus ORFs whose translation is
supported by Ribo-seq (Finkel et al., 2020b) for a total of 42,337 proteins. Among the 28 annotated SARS-CoV-2 proteins we opted to omit
the full-length polyproteins ORF1a and ORF1ab, to simplify peptide-to-protein assignment, and instead represented ORF1ab as the mature
16 individual non-structural proteins that result from proteolytic processing of the 1a and 1ab polyproteins. We added the D614G variant
of the SARS-CoV-2 Spike protein that is commonly observed in European and American virus isolates. For additional searches, we also
added 2036 entries from 6-frame translation of the SARS-CoV-2 genome for all possible ORFs longer than 6 amino acids (Table S1).
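
A 6-frame translation of a genome can be sketched in a few lines. The example below is an illustration and not the procedure used to build Table S1: it assumes an ORF is any stop-to-stop stretch in one of the six frames (the text does not define ORF boundaries), uses the standard genetic code, and all function names are hypothetical.

```python
from typing import Iterator, Tuple

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(genome: str, longer_than: int = 6) -> Iterator[Tuple[str, int, str]]:
    """Yield (strand, frame, peptide) for stop-to-stop stretches above the length cutoff."""
    genome = genome.upper().replace("U", "T")  # accept RNA input
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            for segment in translate(seq[frame:]).split("*"):
                if len(segment) > longer_than:
                    yield strand, frame, segment


# Toy sequence, not a viral genome.
for hit in six_frame_orfs("ATGGCCAAATTTGGGCCCAAATAA"):
    print(hit)
```

Requiring an initiating methionine, or a different start codon set, would change the number of entries, so the count produced by this sketch should not be expected to match the 2036 entries reported above.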

For immunopeptidome data MS/MS spectra were excluded from searching if they did not have a precursor MH+ in the range of
600-4000, had a precursor charge > 5, or had a minimum of < 5 detected peaks. Merging of similar spectra with the same precursor
m/z acquired in the same chromatographic peak was disabled. Prior to searches, all MS/MS spectra had to pass the spectral quality
filter with a sequence tag length > 1 (i.e., minimum of 3 masses separated by the in-chain masses of 2 amino acids) based on HLA v3
peak detection. MS/MS search parameters included: ESI-QEXACTIVE-HCD-HLA-v3 scoring parameters; no-enzyme specificity;
fixed modification: carbamidomethylation of cysteine; variable modifications: cysteinylation of cysteine, oxidation of methionine,
deamidation of asparagine, acetylation of protein N-termini, and pyroglutamic acid at peptide N-terminal glutamine; precursor
mass shift range of −18 to 81 Da; precursor mass tolerance of ± 10 ppm; product mass tolerance of ± 10 ppm, and a minimum
matched peak intensity of 30%. Peptide spectrum matches (PSMs) for individual spectra were automatically designated as confidently
assigned using the Spectrum Mill auto-validation module to apply target-decoy based FDR estimation at the PSM level
of < 1.5% FDR. For the TMT-labeled time course experiments, two parameters were revised: the MH+ range filter was 800-6000,
and TMT labeling was required at lysine, but peptide N-termini were allowed to be either labeled or unlabeled. Relative abundances
of peptides in the time-course experiments were determined in Spectrum Mill using TMT reporter ion intensity ratios from each PSM.
TMT reporter ion intensities for the 3 time points split across two plexes were not corrected for isotopic impurities because the
respective adjacent intervening labels were not included. Each peptide-level TMT ratio was calculated as the median of all PSMs
contributing to that peptide. PSMs that lacked a TMT label or had a negative delta forward-reverse identification score (half of all
false-positive identifications) were excluded from the calculation. Intensity values for each time point were normalized to the
24h time point to compare between the 12|18|24h and 3|6|24h plex.
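
The peptide-level TMT summarization described above (median over eligible PSMs, then normalization to the shared 24h channel) can be sketched as follows. This is an illustration outside Spectrum Mill, not its implementation; the `Psm` record, its field names, and the intensity values in the usage lines are hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, Iterable


@dataclass(frozen=True)
class Psm:
    ratio: float                 # TMT reporter ion intensity ratio for this PSM
    has_tmt_label: bool          # False if the identified peptide carries no TMT label
    delta_fwd_rev_score: float   # forward minus reverse identification score


def peptide_ratio(psms: Iterable[Psm]) -> float:
    """Median ratio over PSMs that are TMT labeled and have a non-negative delta score."""
    kept = [p.ratio for p in psms if p.has_tmt_label and p.delta_fwd_rev_score >= 0]
    if not kept:
        raise ValueError("no PSM passed the filters")
    return median(kept)


def normalize_to_reference(intensities: Dict[str, float], reference: str = "24h") -> Dict[str, float]:
    """Scale every time point by the reference channel so plexes share a common anchor."""
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference intensity must be positive")
    return {time_point: value / ref for time_point, value in intensities.items()}


# Hypothetical intensities for one peptide in each plex.
late_plex = normalize_to_reference({"12h": 50.0, "18h": 100.0, "24h": 200.0})
early_plex = normalize_to_reference({"3h": 10.0, "6h": 40.0, "24h": 400.0})
print({**early_plex, **late_plex})
```

Because both plexes contain the 24h sample, dividing by it places the 3h and 6h values on the same relative scale as the 12h and 18h values.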

For whole proteome data MS/MS spectra were excluded from searching if they did not have a precursor MH+ in the range of
600-6000, had a precursor charge > 5, had a minimum of < 5 detected peaks, or failed the spectral quality filter with a sequence
tag length > 0 (i.e., minimum of 2 masses separated by the in-chain masses of 1 amino acid) based on ESI-QEXACTIVE-HCD-v4-30-20
peak detection. Similar spectra with the same precursor m/z acquired in the same chromatographic peak were merged.
MS/MS search parameters included: ESI-QEXACTIVE-HCD-v4-30-20 scoring parameters; Trypsin allow P specificity with a
maximum of 4 missed cleavages; fixed modification: carbamidomethylation of cysteine and seleno-cysteine; variable modifications:
oxidation of methionine, deamidation of asparagine, acetylation of protein N-termini, pyroglutamic acid at peptide N-terminal
glutamine, and pyro-carbamidomethylation at peptide N-terminal cysteine; precursor mass shift range of −18 to 64 Da; precursor mass
tolerance of ± 20 ppm; product mass tolerance of ± 20 ppm, and a minimum matched peak intensity of 30%. Peptide spectrum
matches (PSMs) for individual spectra were automatically designated as confidently assigned using the Spectrum Mill auto-validation
module to apply target-decoy based FDR estimation at the PSM level of < 1.0% FDR. Protein level data was summarized by top
uses shared (SGT) peptide grouping and non-human contaminants were removed. SARS-CoV-2 derived proteins were manually
filtered to include identifications with > 6% sequence coverage and at least 2 unique peptides.


Validation of peptide identifications 

Peptide identifications were validated using synthetic peptides. Synthetic peptides were obtained from Genscript at > 90%
purity and dissolved to 10 mM in DMSO. For LC-MS/MS measurements, peptides were pooled and further diluted with 0.1% FA/3%
ACN to load 120 fmol/µl on column. One aliquot of synthetic peptides was also TMT labeled as described above. LC-MS/MS measurements
were performed as described above. For plots, peak intensities in the experimental and the synthetic spectrum were
normalized to the highest peak.


RNA-Seq of SARS-CoV-2 infected cells 


A549 and HEK293T cells were seeded into 6-well plates at a density of 5 x 10? cells per well (one well per condition). After 11-24h, the 
cells were infected with SARS-CoV-2 at an MOI of 3. At 12, 18 and 24h post-infection, the cells were lysed in Trizol (Thermo,
15596026), and the total RNA was isolated using standard phenol chloroform extraction. Standard Illumina TruSeq Stranded mRNA
(LT) library preparation was performed using 500 ng of total RNA (Illumina, FC-122-2101). Oligo-dT beads were used to capture
polyA-tailed RNA, followed by fragmentation and priming of the captured RNA (8 minutes at 94°C). First strand cDNA synthesis was
performed immediately afterward. Second strand cDNA synthesis was performed using second strand marking master mix, DNA polymerase I,
and RNase H. cDNA was adenylated at the 3' ends, followed immediately by ligation of RNA single-index adapters (AR001-AR012). Library
amplification was performed for 12-15 cycles under standard Illumina library PCR conditions. Library quantitation was performed
using Agilent 2200 TapeStation D1000 ScreenTape (Agilent, 5067-5582). RNA sequencing was performed on the NextSeq 550 System
using a NextSeq V2.5 High Output 75 cycle kit (Illumina, 20024906) or 150 cycle kit (Illumina, 20024907) for paired-end
sequencing (70 nt of each end).


In-vitro MHC-peptide binding assay 

Classical competition assays, based on the inhibition of binding of a high affinity radiolabeled ligand to purified MHC molecules, were
utilized to quantitatively measure peptide binding to HLA-A and -B class I MHC molecules. The assays were performed, and MHC
purified, as detailed previously (Sidney et al., 2013). Briefly, 0.1-1 nM of radiolabeled peptide was co-incubated at room temperature
with 1 µM to 1 nM of purified MHC in the presence of a cocktail of protease inhibitors and 1 µM β2-microglobulin. MHC bound radioactivity
was determined following a two-day incubation by capturing MHC/peptide complexes on W6/32 (anti-class I) antibody
coated Lumitrac 600 plates (Greiner Bio-one, Frickenhausen, Germany), and measuring bound cpm on the TopCount (Packard Instrument
Co., Meriden, CT) microscintillation counter. The concentration of peptide yielding 50% inhibition of the binding of the radiolabeled
peptide was calculated. Under the conditions utilized, where [label] < [MHC] and IC50 ≥ [MHC], the measured IC50 values are
reasonable approximations of the true Kd values (Cheng and Prusoff, 1973; Gulukota et al., 1997). Each competitor peptide was
tested at six different concentrations covering a 100,000-fold dose range, and in three or more independent experiments. As a positive
control, the unlabeled version of the radiolabeled probe was also tested in each experiment.
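
The text does not state how the 50% inhibition point was derived from the six-point dose series. One common approach, shown here purely as an assumed sketch, is log-linear interpolation between the two competitor doses that bracket 50% inhibition. The function name, argument names, and example values are hypothetical.

```python
import math

def ic50_from_curve(concs_nM, bound_cpm, max_cpm):
    """Estimate IC50 (nM) by log-linear interpolation.

    concs_nM : competitor peptide concentrations
    bound_cpm: bound radioactivity measured at each concentration
    max_cpm  : bound radioactivity without competitor (0% inhibition)
    Returns None when the curve never crosses 50% inhibition.
    """
    points = sorted(zip(concs_nM, bound_cpm))
    inhibition = [(c, 100.0 * (1.0 - b / max_cpm)) for c, b in points]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(inhibition, inhibition[1:]):
        if i_lo < 50.0 <= i_hi:
            # Fractional position of 50% between the two bracketing doses,
            # applied on a log10 concentration scale.
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Six doses spanning a 100,000-fold range (illustrative numbers only).
doses = [0.3, 3, 30, 300, 3000, 30000]
cpm = [980, 900, 700, 300, 100, 20]
estimate = ic50_from_curve(doses, cpm, max_cpm=1000)
```

With [label] < [MHC] and IC50 ≥ [MHC], as stated above, an estimate of this kind is treated as an approximation of Kd.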


Vaccinations of HLA-A2 transgenic mice 

Five mice were immunized subcutaneously in the flank with a vaccine. The vaccine contained nine A*02:01 peptides (50 µg of each peptide
per mouse) emulsified in Complete Freund's Adjuvant (CFA; BD Bioscience/Difco) supplemented with 20 µg PolyIC/LC (Hiltonol/Oncovir).
Ten days post-vaccination, animals were euthanized using CO2, and spleens were removed for ELISpot assays.


Mouse ELISpot assay 

ELISpot was performed using red blood cell-depleted mouse splenocytes (200,000 cells/well) co-incubated with the individual peptides
(10 µg/ml) in triplicate in ELISpot plates (Millipore, Billerica, MA) for 18h. Interferon-γ (IFNγ) secretion was detected using capture
and detection antibodies as described (Mabtech AB, Nacka Strand, Sweden) and imaged using an ImmunoSpot Series Analyzer
(Cellular Technology, Ltd, Cleveland, OH). HLA-A*02:01 restricted HIV-GAG peptide and non-stimulated wells were used as negative
controls. Spot numbers were normalized by removing the average background spot numbers calculated from negative control wells.
Anti-CD3 (2C11; BD Bioscience) and PHA were used as positive controls. 55 spot-forming units/10^6 cells and a > 3-fold increase over
baseline were used as the threshold for positive responses. Methods were described in detail previously (Keskin et al., 2015).


ELISpot assay with COVID-19 PBMCs 

Peripheral blood samples were collected from COVID-19 convalescent patients and PBMC were isolated using Ficoll density gradient
centrifugation. PBMC were plated in serum-free T cell assay media at 2.5e5 cells per well in a Human IFNγ single color ELISpot plate
(Cellular Technology Limited [CTL], Cat# hIFNgp-2M/2). The canonical and non-canonical pools (canonical pool: APHGHVMVEL,
EEFEPSTQYEY, EIKESVQTF, EILDITPCSF, EILDITPCSFG, FASEAARVV, FAVDAAKAY, IRQEEVQEL, KNIDGYFKIY, KRVDWTIEY,
NATNVVIKV, QLTPTWRVY, SEFSSLPSY, VGYLQPRTF, and YLNSTNVTI; non-canonical pool: DEFVVVTV, ELPDEFVVVTV,
GLITLSYHL, GPMVLRGLIT, LEDKAFQL, MLLGSMLYM, and SLEDKAFQL) and a commercial nucleocapsid overlapping peptide
pool (JPT Peptide Technologies) were added to duplicate wells at a concentration of 2 µg/ml of each peptide. A negative control well
(to which just the equivalent concentration of DMSO was added) was used to assess background IFNγ secretion. Cells were incubated
for 16-20 hours at 37°C before developing according to the manufacturer's instructions. Spots were counted using an ImmunoSpot
CoreS6 ELISpot counter (ImmunoSpot). The negative control background was subtracted from the antigen wells and the results are
shown as spot forming units (SFU) per 2.5e5 PBMC. A spot cut off of 8 after background subtraction is used here to denote a positive
response.
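
The background subtraction and positivity call for the PBMC ELISpot can be written as a short sketch. Averaging the duplicate wells and treating the cutoff as inclusive (net SFU of 8 or more) are assumptions, since the text does not specify either. The function name and example counts are hypothetical.

```python
def elispot_call(antigen_wells, dmso_wells, cutoff=8):
    """Return (net SFU per 2.5e5 PBMC, positive?) for one peptide pool.

    antigen_wells: spot counts from the duplicate antigen wells
    dmso_wells   : spot counts from the DMSO-only negative control well(s)
    """
    mean_antigen = sum(antigen_wells) / len(antigen_wells)  # assumed: mean of duplicates
    background = sum(dmso_wells) / len(dmso_wells)
    net_sfu = mean_antigen - background
    return net_sfu, net_sfu >= cutoff  # assumed: cutoff is inclusive

# Illustrative counts only.
net, positive = elispot_call([30, 26], [5])
```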


Multiplexed tetramer assay 

HLA-A*02:01 and HLA-B*07:02 extracellular domains were expressed in E. coli and refolded along with beta-2-microglobulin and UV-labile
place-holder peptides KILGFVFJV and AARGJTLAM, respectively (Altman and Davis, 2016). The MHC monomer was then purified
by size exclusion chromatography (SEC). MHC tetramers were produced by mixing alkylated MHC monomers and azidylated
streptavidin in 0.5 mM copper sulfate, 2.5 mM BTTAA and 5 mM ascorbic acid for up to 4 h on ice, followed by purification of highly
multimeric fractions by SEC. Individual peptide exchange reactions containing 500 nM MHC tetramer and 60 µM peptide were exposed
to long-wave UV (366 nm) at a distance of 2-5 cm for 30 min at 4°C, followed by 30 min incubation at 30°C. A biotinylated oligonucleotide
barcode (Integrated DNA Technologies) was added to each individual reaction followed by a 30 minute incubation at 4°C. Individual
tetramer reactions were then pooled and concentrated using 30 kDa molecular weight cut-off centrifugal filter units (Amicon).

De-identified peripheral blood mononuclear cells (PBMCs) from convalescent COVID-19 positive donors or unexposed donors 
were purchased from Precision for Medicine (USA), Sanguine (USA), or CTL (USA). These companies ethically collected samples under
informed consent or as clinical excess specimens under a waiver of consent. PBMCs were thawed, and CD8+ T cells were enriched 
by magnetic-activated cell sorting (MACS) using a CD8+ T Cell Isolation Kit (Miltenyi) following the manufacturer’s protocol. The 
CD8+ T cells were then stained with 1 nM final concentration tetramer library in the presence of 2 mg/mL salmon sperm DNA in 
PBS with 0.5% BSA solution for 20 minutes. Cells were then labeled with anti-TCR ADT (IP26, BioLegend) for 15 minutes followed
by washing. Tetramer bound cells were then labeled with PE conjugated anti-DKDDDDK-Flag antibody (BioLegend) followed by 
dead cell discrimination using 7-amino-actinomycin D (7-AAD). The live, tetramer positive cells were sorted using a Sony MA900 
Sorter (Sony). 


Single-cell Sequencing 

Tetramer positive cells were counted by Nexcelom Cellometer (Lawrence, MA, USA) using AOPI stain following the manufacturer's recommended
conditions. Single-cell encapsulations were generated utilizing 5' v1 Gem beads from 10x Genomics (Pleasanton, CA,
USA) on a 10x Chromium controller, and downstream TCR and surface marker libraries were made following the manufacturer's recommended
conditions. All libraries were quantified on a BioRad CFX 384 (Hercules, CA, USA) using Kapa Biosystems (Wilmington, MA,
USA) library quantification kits and pooled at an equimolar ratio. TCR, surface marker, and tetramer-generated libraries were
sequenced on Illumina (San Diego, CA, USA) NextSeq550 instruments.


HLA-I antigen presentation prediction
HLAthena, a prediction tool trained on endogenous LC-MS/MS-identified epitope data, was used to predict HLA class I presentation
for all unique 8-11-mer SARS-CoV-2 peptides across 31 HLA-A, 40 HLA-B and 21 HLA-C alleles (Sarkizova et al., 2020).
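
As an illustration of how the input to such a prediction can be assembled, the sketch below tiles protein sequences into all unique 8-11-mers and records which source proteins each peptide comes from. The function name and the toy sequence are hypothetical. The step of scoring each peptide with HLAthena is not shown.

```python
def unique_kmers(proteins, k_min=8, k_max=11):
    """Enumerate unique peptides of length k_min..k_max.

    proteins: dict mapping protein/ORF name -> amino acid sequence.
    Returns a dict mapping peptide -> set of source names.
    """
    peptides = {}
    for name, seq in proteins.items():
        for k in range(k_min, k_max + 1):
            # Slide a window of length k along the sequence.
            for start in range(len(seq) - k + 1):
                peptides.setdefault(seq[start:start + k], set()).add(name)
    return peptides

# Toy 12-residue sequence (not a real viral protein).
toy = unique_kmers({"toy_orf": "ACDEFGHIKLMN"})
```

Keeping the source names alongside each peptide matters for overlapping reading frames, where the same genomic region contributes peptides from more than one ORF.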


HLA allele frequencies and coverage estimates 

World frequencies of HLA-A, -B, and -C alleles in Table S6B are based on a meta-analysis of high-resolution HLA allele frequency data
describing 497 population samples representing approximately 66,800 individuals from throughout the world (Solberg et al., 2008),
downloaded from http://pypop.org/popdata/2008/byfreq-A.php.html, http://pypop.org/popdata/2008/byfreq-B.php.html, and
http://pypop.org/popdata/2008/byfreq-C.php. Subpopulation frequencies for AFA, API, EUR, HIS, and USA were obtained from supplemental
data in Poran et al. (2020).


The cumulative phenotypic frequency (CPF) of peptides was calculated using

$$\mathrm{CPF} = 1 - \Bigl(1 - \sum_{i \in C} p_i\Bigr)^2,$$

assuming Hardy-Weinberg proportions for the HLA genotypes (Dawson et al., 2001), where $p_i$ is the population frequency of the $i$-th allele
within a subset of HLA-A, -B, or -C alleles, denoted $C$. Coverage across HLA-A, -B, and -C alleles was calculated similarly:

$$\mathrm{CPF} = 1 - \Bigl(1 - \sum_{i \in A} p_i\Bigr)^2 \Bigl(1 - \sum_{j \in B} p_j\Bigr)^2 \Bigl(1 - \sum_{k \in C} p_k\Bigr)^2,$$

where $A$, $B$, and $C$ denote a subset of HLA-A, -B, and/or -C alleles for which
the coverage is computed, as recently done in Poran et al. (2020).
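
The coverage calculation described above can be sketched in a few lines. Under Hardy-Weinberg proportions, the chance that an individual carries none of the selected alleles at one locus is the squared complement of their summed frequency, and the loci are multiplied together. The function name and the frequencies in the example are hypothetical and are not values from Table S6B.

```python
def cumulative_phenotypic_frequency(freqs_by_locus):
    """CPF for a set of alleles, assuming Hardy-Weinberg proportions.

    freqs_by_locus: dict mapping locus ('A', 'B', 'C') -> list of
    population frequencies of the alleles of interest at that locus.
    """
    uncovered = 1.0
    for freqs in freqs_by_locus.values():
        # Probability that both chromosomes lack every selected allele.
        uncovered *= (1.0 - sum(freqs)) ** 2
    return 1.0 - uncovered

# Illustrative frequencies only.
single_locus = cumulative_phenotypic_frequency({"A": [0.2, 0.1]})
three_loci = cumulative_phenotypic_frequency({"A": [0.3], "B": [0.5], "C": []})
```

A locus with no selected alleles contributes a factor of 1, so the multi-locus form reduces to the single-locus form when only one locus is supplied.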


QUANTIFICATION AND STATISTICAL ANALYSIS 


Whole proteome analysis 

Post-filtering, intensity-based absolute quantification (iBAQ) was performed on the whole proteome LC-MS/MS data as described in
Schwanhausser et al. (2011). Briefly, iBAQ values were calculated as log10(totalIntensity/numObservableTrypticPeptides), where
the total precursor ion intensity for each protein was calculated in Spectrum Mill as the sum of the precursor ion chromatographic
peak areas (in MS1 spectra) for each precursor ion with a peptide spectrum match (MS/MS spectrum) to the protein, and the
numObservableTrypticPeptides for each protein was calculated using the Spectrum Mill Protein Database utility as the number of tryptic
peptides with length 8-40 amino acids, with no missed cleavages allowed. Of note, S coverage was 55% in the HEK293T and
44% in the A549 post 24 hour fractionated data, which may be due to the high levels of glycosylation. Lower peptide coverage
may lead to underestimation of S protein in our data. Both log10 transformed total intensity and iBAQ values were median normalized
by subtracting sample specific medians and adding global medians for each abundance metric and reported in Table S4.
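
A minimal sketch of the iBAQ calculation and the median normalization follows. It is not the Spectrum Mill utility: the in silico digest simply cleaves after every K or R (including before proline, in line with "Trypsin allow P"), and the global median is assumed to be the median over all values from all samples. All function names are hypothetical.

```python
import math
from statistics import median

def count_observable_tryptic_peptides(sequence, min_len=8, max_len=40):
    """Count fully tryptic peptides (no missed cleavages) of 8-40 residues."""
    fragments, current = [], ""
    for aa in sequence:
        current += aa
        if aa in "KR":  # cleave C-terminal to K or R
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)  # C-terminal peptide
    return sum(min_len <= len(f) <= max_len for f in fragments)

def log10_ibaq(total_intensity, sequence):
    """log10(totalIntensity / numObservableTrypticPeptides), or None if undefined."""
    n = count_observable_tryptic_peptides(sequence)
    if n == 0 or total_intensity <= 0:
        return None
    return math.log10(total_intensity / n)

def median_normalize(samples):
    """Subtract each sample's median and add the global median.

    samples: dict mapping sample -> dict mapping protein -> log10 value.
    """
    global_median = median(v for vals in samples.values() for v in vals.values())
    return {
        s: {p: v - median(vals.values()) + global_median for p, v in vals.items()}
        for s, vals in samples.items()
    }
```

After normalization every sample has the same median, so differences between samples reflect relative protein abundance and not overall loading.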

To evaluate post-infection protein level changes for a large set of proteins across cell lines and time points (Figures 4E and 4F; Figure S3E),
all proteins detected in at least 6 out of 8 samples (A549 and HEK293T, each profiled at 0, 3, 6, and 24 hpi) were retained
in the analysis, with missing values imputed to the lowest 10th percentile of the log10 iBAQ value distribution (Tyanova et al., 2016). A
similar approach was used for the reanalysis of PXD020019, with values of proteins observed in eight of the twelve samples considered
for imputation, and zero values replaced with a value below the minimum LFQ value reported.


Infected cells RNA-seq reads alignment 


Sequencing reads were mapped to the SARS-CoV-2 genome (RefSeq NC_045512.2) and human transcriptome (Gencode v32).
Alignment was performed using Bowtie version 1.2.2 (Langmead et al., 2009) with a maximum of two mismatches per read.
The fraction of human and viral reads was determined based on the total number of reads aligned to either SARS-CoV-2 or
human transcripts.


Scoring pMHC-TCR interactions 

Tetramer data analysis was performed using Python 3.7.3. For each single-cell encapsulation, tetramer UMI counts were arranged in a
matrix of cells (rows) by tetramers (columns) and log-transformed. The matrix was then Z-score transformed row-wise and subsequently
median-centered by column. Means were calculated by clonotype, and those with a value greater than 4 were characterized as positive interactions.
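
The scoring steps can be sketched with the standard library alone. This is an illustration under stated assumptions and not the authors' code: `log1p` is assumed for the log transform so that zero counts are defined, the population standard deviation is assumed for the Z-score, and the function and variable names are hypothetical.

```python
import math
from statistics import mean, median, pstdev

def score_clonotypes(umi_counts, clonotypes, threshold=4.0):
    """Score pMHC-TCR interactions from tetramer UMI counts.

    umi_counts: list of rows (one per cell), each a list of tetramer counts
    clonotypes: clonotype identifier for each cell (same order as rows)
    Returns (scores per clonotype, set of (clonotype, tetramer index) hits).
    """
    # 1. Log-transform (assumed log1p).
    logged = [[math.log1p(x) for x in row] for row in umi_counts]
    # 2. Row-wise Z-score: each cell across its tetramers.
    zscored = []
    for row in logged:
        mu, sd = mean(row), pstdev(row)
        zscored.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    # 3. Median-center each tetramer column across cells.
    n_cols = len(zscored[0])
    col_medians = [median(row[j] for row in zscored) for j in range(n_cols)]
    centered = [[row[j] - col_medians[j] for j in range(n_cols)] for row in zscored]
    # 4. Mean per clonotype, then threshold.
    groups = {}
    for cid, row in zip(clonotypes, centered):
        groups.setdefault(cid, []).append(row)
    scores = {cid: [mean(r[j] for r in rows) for j in range(n_cols)]
              for cid, rows in groups.items()}
    hits = {(cid, j) for cid, vals in scores.items()
            for j, v in enumerate(vals) if v > threshold}
    return scores, hits
```

With a population standard deviation, a row-wise Z-score over n tetramers cannot exceed the square root of n - 1, so a cutoff of 4 is only reachable when the tetramer library has at least 18 members.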


T cells transcriptomics analysis 

Hydrogel-based RNA-seq data were analyzed using the Cell Ranger package from 10X Genomics (v3.1.0) with the GRCh38 human
expression reference (v3.0.0). Except where noted, Scanpy (v1.6.0; Wolf et al., 2018) was used to perform the subsequent single-cell
analyses. Any exogenous control cells identified by TCR clonotype were removed before further gene expression processing. Hydrogels
that contained UMIs for fewer than 300 genes were excluded. Genes that were detected in fewer than 3 cells were also excluded from
further analysis. The following additional quality control thresholds were also enforced. To remove data generated from cells likely to
be damaged, an upper threshold was set for the percentage of UMIs arising from mitochondrial genes (13%). To exclude data likely arising from
multiple cells captured in a single drop, upper thresholds were set for total UMI counts based on individual distributions from each
encapsulation (from 1500 to 3000 UMIs). A lower threshold of 10% was set for UMIs arising from ribosomal protein genes. Finally, an
upper threshold of 5% of UMIs was set for the MALAT1 gene. Any hydrogel outside of any of the thresholds was omitted from further
analysis. A total of 15,683 hydrogels were carried forward. Gene expression data were normalized to counts per 10,000 UMIs per cell
(CP10K) followed by log1p transformation: ln(CP10K + 1).
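
The quality control thresholds and the normalization can be expressed compactly. The sketch below uses plain dictionaries in place of Scanpy objects. The field names (`n_genes`, `total_umis`, `pct_mito`, `pct_ribo`, `pct_malat1`) are hypothetical, and the per-encapsulation UMI ceiling is passed in because it varied from 1500 to 3000.

```python
import math

def passes_qc(cell, max_total_umis):
    """Apply the per-hydrogel thresholds listed above."""
    return (
        cell["n_genes"] >= 300                   # UMIs for at least 300 genes
        and cell["pct_mito"] <= 13.0             # likely damaged cells
        and cell["total_umis"] <= max_total_umis # likely multiplets
        and cell["pct_ribo"] >= 10.0             # ribosomal protein genes
        and cell["pct_malat1"] <= 5.0            # MALAT1
    )

def log_cp10k(counts):
    """Normalize one cell to counts per 10,000 UMIs, then ln(CP10K + 1).

    counts: dict mapping gene -> raw UMI count for a single cell.
    """
    total = sum(counts.values())
    return {gene: math.log1p(c / total * 1e4) for gene, c in counts.items()}
```

In Scanpy the same normalization is usually obtained with `scanpy.pp.normalize_total` (with `target_sum=1e4`) followed by `scanpy.pp.log1p`.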


Clustering and Annotation of single cell data 

Highly variable genes were identified (1,567) and scaled to have a mean of zero and unit variance. They were then provided to Scanorama
(v1.7; Hie et al., 2019) to perform batch integration and dimension reduction. These data were used to generate the nearest
neighbor graph which was in turn used to generate a UMAP representation that was used for Leiden clustering. The hydrogel data
(not scaled to mean zero, unit variance, and before extraction of highly variable genes) were labeled with cluster membership and
provided to SingleR (v1.4.0; Aran et al., 2019) using the following references from celldex (v1.0.0; Aran et al., 2019): MonacoImmuneData,
DatabaseImmuneCellExpressionData, and BlueprintEncodeData. SingleR was used to annotate the clusters with their best-fit
match from the cell types in the references. Clusters that yielded cell types other than types of the T cell lineage were removed from
consideration and the process was repeated starting from the batch integration step. The best-fit annotations from SingleR after the
second round of clustering and annotation were assigned as putative labels for each Leiden cluster.

To corroborate the SingleR best-fit annotations and provide further evidence as to the phenotype of the clusters, gene
panels representing functional categories (Naive, Effector, Memory, Exhaustion, Proliferation) were used to score each hydrogel's
expression profile using scanpy's "score_genes" function (Wolf et al., 2018), which compares the mean expression values of the
target gene set against a larger set of randomly chosen genes that represent background expression levels. The gene panels for
each class were (Su et al., 2020): Naive - TCF7, LEF1, CCR7; Effector - GZMB, PRF1, GNLY; Memory - AQP3, CD69, GZMK; Exhaustion
- PDCD1, TIGIT, LAG3, HAVCR2 (TIM3); Proliferation - MKI67, TYMS. The gene expression matrix for all hydrogels was first
imputed using the MAGIC algorithm (v2.0.4; van Dijk et al., 2018). These functional scores were the only data generated from imputed
expression values.
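
The idea behind this kind of gene panel score can be shown with a simplified sketch: the mean expression of the panel genes minus the mean expression of a random background set. This is not scanpy's implementation, which samples its reference genes from expression-matched bins. The function name, background size, and seed are hypothetical choices.

```python
import random
from statistics import mean

def score_gene_set(expr, gene_set, n_background=50, seed=0):
    """Simplified panel score for one cell.

    expr    : dict mapping gene -> (imputed, normalized) expression value
    gene_set: iterable of panel genes, e.g. ['GZMB', 'PRF1', 'GNLY']
    """
    panel = set(gene_set)
    targets = [g for g in panel if g in expr]
    # Background genes are drawn at random from everything outside the panel.
    pool = sorted(g for g in expr if g not in panel)
    rng = random.Random(seed)  # fixed seed so the score is reproducible
    background = rng.sample(pool, min(n_background, len(pool)))
    return mean(expr[g] for g in targets) - mean(expr[g] for g in background)

EFFECTOR = ["GZMB", "PRF1", "GNLY"]  # Effector panel listed above
```

A positive score means the panel is expressed above the background level in that hydrogel, which is the quantity used to support the cluster labels.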


[Figure S1 image. Panels A and B: DAPI and nucleocapsid staining of A549/ACE2/TMPRSS2 and HEK293T/ACE2/TMPRSS2 cells, with bar plots at 3, 6, 12, 18, and 24 hpi. Panel C: plaque assays for A549 + SARS-CoV-2 supernatant, A549 + SARS-CoV-2 + lysis buffer, and A549 + lysis buffer (replicates 1 and 2).]

Figure S1. SARS-CoV-2 infection of HEK293T/ACE2/TMPRSS2 and A549/ACE2/TMPRSS2, related to Figure 1

(A) A549 cells expressing ACE2 and TMPRSS2 were infected with SARS-CoV-2 at MOI of 3 for 3, 6, 12, 18, and 24 hours. Fixed cells were incubated with a
fluorescent antibody to the nucleocapsid and DAPI stain was used to label the nuclei. Immunofluorescent images were taken using an EVOS microscope with a
10x lens. Bars show mean ± SD. (B) Similar to (A) for HEK293T cells. (C) Plaque assay confirming SARS-CoV-2 inactivation for HLA-IP experiments. A549 cells
were infected with SARS-CoV-2 at MOI of 3 for 24 hours. 10-fold serial dilutions were prepared in Opti-MEM and used to infect Vero cells in a 24-well plate.
Comparing plaques in (left) cultured media of infected A549 cells; (middle) SARS-CoV-2 infected A549 cells treated with a lysis buffer containing 1.5% Triton-X
and Benzonase for 3 hours; and (right) non-infected A549 cells. When adding the 1:10 dilution of the lysis buffer, infected and non-infected cells died immediately
due to the relatively high Triton-X concentration.


[Figure S2 image. (A) Peptide sequence logos per allele, with and without SARS-CoV-2: A*25:01, A*30:01, B*18:01, B*44:03, C*16:01, and C*12:03 in A549; A*02:01, B*07:02, and C*07:02 in HEK293T; residues colored by chemistry (acidic, basic, hydrophobic, neutral, polar). (B) Aggregated logos for A549 and HEK293T at 3, 6, 12, 18, and 24 hpi. (C) Allele assignment of peptides per time point. (D) HLA-A, HLA-B, and HLA-C expression, log2p1(TPM), with and without SARS-CoV-2. (E) Fraction of human 9-mers with predicted score better than 50% of known binders.]





Figure S2. Peptide logos and allele assignment for all experiments, related to Figure 1 

(A) Logo plots for individual alleles of peptides identified and assigned to cell line specific alleles with HLAthena percentile rank < 0.5 for naive and 24h post
SARS-CoV-2 infected A549 (left) and HEK293 (right) cells. (B) Peptide logo plots aggregated over all alleles for label free time course experiments in A549 and HEK293
samples. (C) Allele assignment for peptides identified in time course experiments using HLAthena with a percentile rank < 0.5 cutoff. (D) Expression level of
HLA-A, -B, and -C alleles as measured by RNA-seq in A549 and HEK293T cell lines pre- and 24hr post-infection. (E) All 9-mer peptides tiled along human protein
sequences were predicted for binding to the HLA alleles present in HEK293T (top) and A549 (bottom) cell lines. Per allele, the fraction of those 9-mers with
predicted binding scores that are better than 50% of previously identified known binders in mono-allelic experiments for the allele is shown.





[Figure S3 image. (A) Percentage of SARS-CoV-2 protein abundance by cell line and time point:

hpi | A549 | HEK293T
0 | 0.00% | 0.00%
3 | 0.01% | 0.01%
6 | 0.89% | 0.48%
12 | 2.10% | 2.20%
18 | 2.22% | 2.64%
24 | 3.05% | 2.58%

(B) HEK293T 24hpi SARS-CoV-2 full proteome protein ranks, log10 iBAQ versus protein rank (Wilcoxon rank sum test: W = 42971, p value = 4.477e-06). (C) HEK293T 24hpi SARS-CoV-2 HLA-I peptide ranks, log2 peptide intensity versus peptide rank (Wilcoxon rank sum test: W = 4513, p value = 0.4707); species: human, SARS canonical, SARS non-canonical. (D) Heatmap of antigen processing, proteasome, and ubiquitination proteins in A549 and HEK293T cells with and without SARS-CoV-2. (E) Volcano plot for A549 6hpi SARS-CoV-2, p value versus log10(SARS iBAQ/Uninfected iBAQ).]



Figure S3. SARS-CoV-2 peptide abundance and antigen presentation pathway proteins in infected cells, related to Figure 4 

(A) Table showing the percentage of the total whole proteome abundance represented by SARS-CoV-2 derived proteins at 0, 3, 6, 12, 18, and 24hpi identified in
single-shot whole proteome LC-MS/MS analyses. (B) Rank plot of the protein abundances represented by log10 protein iBAQ values for each human (gray),
canonical SARS-CoV-2 (blue), and noncanonical SARS-CoV-2 proteins detected in the whole proteome analysis of HEK293T cells 24hpi. SARS-CoV-2 proteins
are annotated with their respective gene names. (C) Similar rank plot to (B) but for observed HLA-I peptides and their abundances represented by log2 peptide
intensities in HEK293T cells 24hpi. Peptides mapping to SARS-CoV-2 are annotated with their respective amino acid sequence and source protein name. (D)
Heatmap of log10 iBAQ values for antigen presentation pathway proteins observed across uninfected and 24hpi in A549 and HEK293T cells. (E) Volcano plot
comparing protein levels across uninfected and infected A549/ACE2 cells 6hpi reported in publicly available whole proteome data (PXD020019). Proteins from
SARS-CoV-2 (red), ubiquitination pathways (teal), proteasomal function (purple), antigen processing (pink), and IFN pathways (orange) are colored accordingly.
Significantly changing proteins are shown above the dashed line (p value < 0.01) along with annotations of specific proteins involved in the above pathways.


[Figure S4 image. (A) Number of CD8+ clones (log2) for peptides with (+) and without (-) HLA-A*02:01 binding (p < 1.74e-06). (B) Clonotype network for A*02:01-ELPDEFVVVTV (323 cells, 5 subjects), colored by subject. (C) Alpha and beta CDR3 length distributions. (D) TCR sequence logo. (E) Heatmap of reactivity to EBV and SARS-CoV-2 peptides in COVID-19 convalescent and unexposed donors. (F) Number of CD8+ clones for peptides with and without HLA-B*07:02 binding (p < 3e-10).]





Figure S4. CD8+ responses to HLA-I peptides in individuals with COVID-19 and TCR homology in ELPDEFVVVTV-reactive T cells, related to 
Figure 6 

(A) The number of unique CD8+ T cell clones reacting to HLA-I peptides that were found to bind HLA-A*02:01 in biochemical binding measurement and HLA-I
peptides that did not bind HLA-A*02:01. Wilcoxon rank-sum p value is indicated. The box shows the quartiles, bar indicates median and the whiskers show the
distribution. (B) Network plot showing the relationship of unique clonotypes within and across subjects. Clonotypes, shown as nodes, are connected by edges to other
clonotypes with similar alpha or beta CDR3 (scirpy v0.6.0). (C) CDR3 size distributions for alpha and beta TCR chains. (D) TCR α/β-paired sequence
logo for related clonotypes represented in the interconnected cluster at the bottom of the network shown in (B). (E) CD8+ T cell reactivity detected in convalescent
COVID-19 patients and unexposed subjects expressing B*07:02 to individual peptides that bind HLA-B*07:02 or other alleles. The score in the heatmap indicates
the fraction of peptide-specific reacting T cells from total CD8+ T cells in the sample. (F) Similar to (A) for HLA-B*07:02.


[Figure image. Predicted presentation of the detected SARS-CoV-2 HLA-I peptides across 92 HLA class I alleles, shown per population with peptides sorted by coverage: (A) AFA, (B) API, (C) EUR, (D) HIS. Each panel shows the number of predicted HLA class I alleles per peptide (%rank <= 0.5) and the estimated population coverage; allele colors indicate the %rank threshold (<= 0.5, <= 1.0, <= 2.0).]
EILDITPCSF(S) | B STSAFVETV (nsp2) | E= 
CR STSAFVETV (nsp2) 7 E | ELPDEFVVV (ORF9b) | Pë 
ES E sactortitvinsps) | E- EILDITPCSF (S) | E= 
— |  QLUPTWRVY(S) d = Bi; auprwevys) |= 
a F ELPDEFVVV(ORF9b) | Fe M+  AGTDTTITV (nsp5) i 
F NLNESLIDL (S) IQ Fees | NLNESLIDL (S) = 
F  VGYLQPRTF(S) 4 = F VGYLQPRTF(S) | == 
F EILDITPCSFG (S) 4 +  EILDITPCSFG(S) 4 
i APRITFGGP (N) 4 F | APRITFGGP(N) 4| 
F MLLGSMLYM (S.iORF1) 4 F MLLGSMLYM (S.iORF1) 4 
FGPMVLRGLIT (S.iORF1/2)4 FGPMVLRGLIT (S.iORF1/2)] 
| PepsCombined | F PepsCombined 
1817161514131211109 87 65432 10 0.00 0.25 (050 0.75 1.00 1817161514131211109 8 76543210 0.00 0.25 0.50 0.75 1.00 
# Predicted HLA class | alleles, %rank<=0.5 Estimated population coverage (EUR # Predicted HLA class | alleles, %Yrank<=0.5 Estimated population coverage (HIS 
pop g pop g 
E Predicted presentation across 92 HLA class | alleles 
Population: USA; peptides sorted by coverage 
& (40201 nun — 53507 Wë: ENS H | YLFDESGEFKL(nsp3) | EE << 
5 Bann B1801 Il Co701 ki mh |  FASEAARVV(nsp) -J EE — 
& E A2202 | 82705 [Wl co7o2 LM OM!) i= 
© Ben [51302 B co401 | ` DP SWSKVVKV (nsp15) - [= 
S EN - IROEEVQEL(ORF7a) 1 Fe: 
9 aan [H B4002 [i c0602 FGDDTVIEV (nsp3) 
E ENS ` L | EE — 
S [soz baam [i c0304 E r 
3 ei | ` L FAVDAAKAY (nsp10) 4 E= 
2 [H acsoz | B5201 E co501 WS  ELPDEFVVVTV (ORFOb)- Fe 
S Eso! | B3503 [i] C0303 B CL YLNSTNVTI (nsp3)  - Ps 
= |.1A0206 | |B1503 | |C1601 | BÀ L SLEDKAFQL(ORF9b) | Ey 
S GLITLSYHL (S.iORF1/2) | == 
e |) A0205 | B4501 | C0202 | Bg (Si ) 
j EE 
2 |./40202 | B5001 |. 1203 EU TAQNSVRVL (Hp) 
Bo Rm ` B4201 ` coso2 hz LEDKAFQL (ORF9b) 9 eg 
v DE |  LATNNLVVM (nsp2) - Mess — 
®© an B5802 | C0102 = | APHGHVMVEL (nspi) - == 
9 A0203 B5601  C1502 m - NATNVVIKV (S) j| p 
o A0211 B4601 C1701 EB - KNIDGYFKIY (S) m 
$ A0204 B1517 C1202 B oof TTTIKPVTY (nsp3) 1 === 
= [0702 B4006 ` Coen EB m Mead |— 
J B0801 B1502 C0704 EH | DEFVVVTV Kee | ——— 
J B4402 B1301 — Cos02 WS || KAFQLTPIAV (ORFOb) -| Bez 
[i B3501 B5401 C0403 E EI VATSRTLSY (M) 4 ENS — 
[i B4403 ` B5502| unknown 8 |  TVIEVQGY (nsp3) | E= 
[B5101 B0704 E EIKESVQTF (nsp2) - Ege 
j EILDITPCSF(S) | Pë 
| STSAFVETV (nsp2) 4 RR 
BS ^  ouPTWRVY(S) | = 
B | gPbErFvvv(oRF9b) | E 
B - AGTDTTITV(nsp5) 4i 
F NLNESLIDL (S) J Fees 
F | VGYLQPRTF(S) | 
| EILDITPCSFG (S) 4 
e APRITFGGP (N) 4 
| MLLGSMLYM (S.iORF1) 4 
FGPMVLRGLIT (S.iORF 1/2) 
[ PepsCombined | 
1817161514131211109876543210 0.00 0.25 0.50 0.75 1.00 
# Predicted HLA class | alleles, %Yrank<=0.5 Estimated population coverage (USA) 


Figure S5. Population coverage estimates of LC-MS/MS-identified SARS-CoV-2 HLA-I peptides, related to Figure 7 

HLAthena predictions for 92 HLA-I alleles using percentile rank cutoff values of 0.1, 0.5, 1, and 2% were used to show the number of alleles and estimated
coverage for each LC-MS/MS-observed SARS-CoV-2 peptide across (A) AFA, (B) API, (C) EUR, (D) HIS, and (E) USA populations. Alleles are colored and
ordered according to loci and the corresponding population frequency (high to low color intensity). Peptides are ordered according to their estimated coverage at a
%rank cutoff of 0.5.
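
As an illustration of how an estimated population coverage could be derived from per-allele predictions, the sketch below combines allele frequencies for the alleles predicted to present a peptide. It is not the authors' or HLAthena's procedure, which this caption does not spell out. It assumes Hardy-Weinberg proportions within each locus and independence between loci, and the function name `population_coverage` and the frequencies in `TOY_FREQS` are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from typing import Dict, Set

# Hypothetical toy allele frequencies per locus (NOT real population data).
TOY_FREQS: Dict[str, Dict[str, float]] = {
    "A": {"A0201": 0.5, "A0101": 0.5},
    "B": {"B0702": 0.2, "B0801": 0.8},
}


def population_coverage(presenting: Set[str],
                        freqs: Dict[str, Dict[str, float]]) -> float:
    """Fraction of individuals carrying at least one presenting allele.

    Assumption: two independent allele draws per locus (Hardy-Weinberg)
    and independent loci, so the probability of carrying no presenting
    allele is the product over loci of (1 - f_locus) ** 2.
    """
    p_none = 1.0
    for locus_freqs in freqs.values():
        # Summed frequency of presenting alleles at this locus.
        f = sum(fr for allele, fr in locus_freqs.items() if allele in presenting)
        f = min(f, 1.0)  # guard against frequencies that sum above 1
        p_none *= (1.0 - f) ** 2
    return 1.0 - p_none


if __name__ == "__main__":
    # Alleles passing a chosen %rank cutoff for one peptide (hypothetical).
    print(population_coverage({"A0201", "B0702"}, TOY_FREQS))
```

With these toy numbers, a peptide presented only by `A0201` covers 1 - 0.5^2 = 0.75 of the population, and adding `B0702` raises that to 1 - 0.25 x 0.64 = 0.84. Tightening the %rank cutoff shrinks the set of presenting alleles and therefore the estimate.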


[Figure S6: image not reproduced. (A) Heatmap of identity scores for each HLA-I peptide (rows, labeled with source protein or ORF, e.g., nsp2, nsp3, nsp8, nsp10, S, S.iORF1/2, ORF9b, ORF7a, M) in the B.1.1.7, P.1, and B.1.351 variants (columns). (B) Alignment of the S.iORF1 sequence in RefSeq and B.1.351, marking D80A (I20L mutation in S.iORF1; RGLIT to RGLLT). (C) Alignment of the S.iORF1 sequence in RefSeq and B.1.1.7, marking Del 69-70 (Del 8-9 in S.iORF1).]

Figure S6. HLA-I peptide sequences in B.1.1.7, P.1, and B.1.351 SARS-CoV-2 variants, related to Figure 6 

(A) The sequences of the HLA-I peptides detected in our study were used as tblastn queries against a database containing early representative genomes of
SARS-CoV-2 lineages with the Pango designations B.1.1.7 (29 genomes), P.1 (14 genomes), and B.1.351 (23 genomes); see GISAID acknowledgment table for
accessions (Table S8). Identity scores for each peptide in each variant are shown in the heatmap. (B and C) Mutations in the S.iORF1/2 region of B.1.351 (B) and B.1.1.7
(C) variants in comparison to the SARS-CoV-2 RefSeq sequence NC_045512.2 isolated from Wuhan. The positions of the three HLA-I peptides are indicated.
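
To make the identity score concrete, the sketch below computes the best ungapped percent identity of a peptide against a protein sequence. This is a simplified stand-in, not tblastn: tblastn translates a nucleotide database in six frames and uses its own alignment scoring. The function name `best_identity` is hypothetical. `REF_FRAGMENT` is the legible part of the S.iORF1 RefSeq sequence in panel C, and `VAR_FRAGMENT` is an assumed B.1.351 counterpart built by applying the I20L change (RGLIT to RGLLT) shown in panel B.

```python
# Legible S.iORF1 fragment from panel C (RefSeq row).
REF_FRAGMENT = "LGPMVLRGLITLSYHLMMVFILLPLRSLT"
# Assumption: same fragment with the I20L substitution from panel B applied.
VAR_FRAGMENT = "LGPMVLRGLLTLSYHLMMVFILLPLRSLT"


def best_identity(peptide: str, protein: str) -> float:
    """Best ungapped percent identity of `peptide` over all windows of `protein`."""
    n = len(peptide)
    if n == 0 or len(protein) < n:
        return 0.0
    best_matches = 0
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        # Count positions where the peptide and the window agree.
        matches = sum(a == b for a, b in zip(peptide, window))
        best_matches = max(best_matches, matches)
    return 100.0 * best_matches / n


if __name__ == "__main__":
    for name, seq in (("RefSeq", REF_FRAGMENT), ("I20L variant", VAR_FRAGMENT)):
        print(name, round(best_identity("GLITLSYHL", seq), 1))
```

Under these assumptions, GLITLSYHL matches the reference fragment at all 9 positions and the I20L fragment at 8 of 9, which is the kind of single-residue drop a heatmap of identity scores would show for a peptide overlapping a variant mutation.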


